POPULATION BASED ASSESSMENTS
AND MEANS TO RANK 
THE RELATIVE IMMUNOGENICITY 
OF PROTEINS 

FIELD OF THE INVENTION 

The present invention provides means to assess immune response profiles of 
populations. In particular, the present invention provides means to qualitatively assess the 
immune response of human populations, wherein the immune response directed against 
any protein of interest is analyzed. The present invention further provides means to rank
proteins based on their relative immunogenicity. In addition, the present invention provides 
means to create proteins with reduced immunogenicity for use in various applications. 

BACKGROUND OF THE INVENTION 

Proteins have the capacity to induce potentially life-threatening immune responses. 
This limitation has hindered their widespread use in consumer end-use applications and 
products. Indeed, this potential to induce immune responses has come to the attention of 
the U.S. Food and Drug Administration (FDA), resulting in the requirement for 
immunogenicity testing both prior to and after approval of new protein therapeutics. 
However, although there are a number of animal models available for assessing
immunogenicity, there are no validated methods to discern relative immunogenicity in 
humans. 

The immunogenicity of proteins has also long been a concern in
the enzyme manufacturing industry. Occupational exposure to proteins has been 
documented to result in sensitization of industrial and laboratory workers. Sensitization to 
particular proteins is usually assessed by tests such as the skin-prick test that reveals 
whether an individual has mounted an immune response to the protein. 

In most settings, sensitization is
controlled by reducing the level of airborne protein (See, Sarlo and Kirchner, Curr. Opin. 
Allergy Clin. Immunol., 2:97-101 [2002]; and Schweigert et al., Clin. Exp. Allergy 30:1511-
1518 [2000]). Occupational exposure guidelines have been implemented that control
airborne exposure to proteins. These guidelines, which provide the allowable level of
exposure to particular proteins, have been useful in reducing the overall number of
sensitization events occurring in a given industrial setting. When a new protein is to be 
manufactured, the establishment of occupational exposure guidelines (OEGs) for the new 
protein is a matter of serious concern. A commonly accepted method to determine these 
guidelines is the guinea pig intratracheal test (GPIT) (See, Sarlo, Fundam. Appl. Toxicol.,
39:44-52 [1997]). In this test, guinea pigs are exposed to the test protein via intra-tracheal
instillation for a period of about 10-12 weeks. Serum samples from the animals are taken 
periodically and tested for their levels of antigen-specific antibody by suitable methods 
known in the art (e.g., passive cutaneous anaphylaxis (PCA) testing for IgG1 and
microimmunodiffusion (MID) testing for precipitating IgG). These results are compared to
results obtained from a set of guinea pigs tested with control proteins that have known, 
effective exposure guidelines (e.g., ALCALASE® enzyme, commercially available from 
Novo). Determination of serum titers, MID positivity and time to response are considered, 
and a relative potency value is determined. This method has been used successfully to set 
OEGs for a number of industrial enzymes.

However, while the GPIT test is useful, it is time consuming and expensive, 
requiring a number of animals and multiple rounds of testing. Relatively recently, a mouse- 
based test was established that is reported to reproduce the results obtained in the GPIT, 
through the use of a less expensive and less cumbersome animal model. The mouse 
intranasal test (MINT; See, Robinson et al., Toxicol. Sci. 43:39-46 [1998]) is used by some
companies to set OEGs. However, industry-wide acceptance has not been
achieved for this model (for reviews of predictive tests for protein allergenicity, see
Robinson et al., supra, as well as Kimber et al., Fundam. Appl. Toxicol.,
33:1-10 [1996]; and Kimber et al., Toxicol. Sci., 48:157-162 [1999]).

Thus, although animal models are useful, they have limitations. The use of partially
outbred guinea pigs in the GPIT necessitates the use of large numbers of animals in order
to achieve statistical significance when comparing responses between groups. In addition, 
inter-experiment variation in control animal responses is very high, which makes potency 
determinations based on a single set of control responses less convincing. The MINT 
assay does not suffer from as much variability in antibody responses because the mice
used are typically BDF1 mice, a cross between two highly inbred mouse strains. While this 
additional level of control allows for more robust data analyses, different strains of mice 
typically return very different potency rankings for similar enzymes (See, Blaikie, Food 
Chem. Toxicol., 37:897-904 [1999]; and Blaikie and Basketter, Food Chem. Toxicol., 
37:889-896 [1999]). This is likely due to the specificity of the immune response in a mouse
line that has been inbred to express very limited MHC molecules. In addition, while data from
an individual lab using the MINT assay may be robust, the MINT assay is also plagued by
inter-laboratory differences. 

Significantly, all animal tests suffer from the inability to provide a suitable 
representation of the immune response to a given protein in humans. Inbred strains of 
mice present peptide molecules with the specificity conferred by their murine MHC 
molecules. Human HLA molecules, while highly related to mouse MHC molecules, do not 
have identical peptide specificities. Furthermore, inbred mouse strains have been selected 
for expression of a single I-A and/or I-E molecule, a situation that very rarely occurs in the
highly outbred human population. In addition, the mouse immune system has a number of 
properties which are not found in humans (e.g., the Th1 versus Th2 paradigm that has 
been described in mice is much less clear in humans). For example, in humans, there is 
plasticity in Th1 and Th2 phenotypes that can be explained by a genetic inconsistency in 
the IFN-alpha gene. In contrast, in mice, the Th1 and Th2 phenotypes are not dynamic, 
due to an insertion in the IFN-alpha gene in these animals (See, Farrar, Nat. Immunol., 
1:65-69 [2000]). In addition, humans express HLA class II molecules on activated T cells,
while mice do not. Furthermore, human donors typically carry endogenous viruses, and 
often have subclinical infections, while laboratory mice are typically maintained in a 
specific-pathogen-free (SPF) environment. Another concern is that the C57BL/6 mouse
strain, a popular background for the creation of transgenic mouse models, carries a defined 
antigen-processing defect that makes comparisons to human derived data of questionable 
reliability (Kim and Jang, Eur. J. Immunol., 22:775-782 [1992]). Human HLA transgenic 
mice have become available for application to the mechanistic study of human immune 
responses (See, Boyton and Altmann, Clin. Exp. Immunol., 127:4-11 [2002]; Black et al., J.
Immunol., 169:5595-5600 [2002]; Raju et al., Hum. Immunol., 63:237-247 [2002]; and Das
et al., Rev. Immunogenet., 2:105-114 [2000]). However, the use of these animals is
limited, as HLA transgenic mice suffer from species-specific immune system complexities. 
In addition, at least some of the methods used to construct these mice do not allow for 
accurate analysis of peptide-specific responses, as expression of the HLA transgenes is 
not correctly regulated. HLA transgenic mice are often used for mapping studies when 
expressing a single HLA molecule, a situation not found in humans. This is especially of 
note for HLA-DQ transgenic mice where cross-pairing between different HLA-DQ alleles 
has been shown to create new peptide presentation specificities (See, Krco et al., J.
Immunol., 163:1661-1665 [1999]). Thus, despite advances in the determination, 
assessment, and comparisons of the immunogenicity of proteins, there remains a need in 
the art for simple, reliable and reproducible methods to make such determinations.

Likewise, the application of proteins to therapeutic/industrial and nutritional uses is
limited by the potential for inducing or exacerbating deleterious immune responses. This 
potential is especially of concern for the use of recombinant human-derived proteins. 
Indeed, recombinant human-derived proteins have been demonstrated to induce immune 
responses directed at self-proteins, resulting in the development of autoimmunity (Li et al.,
Blood 98:3241-3248 [2001]; and Casadevall et al., N. Engl. J. Med., 346:469-475 [2002]).
Subsequent reactivation of the immune system after unintended induction of immune 
responses to industrial or food proteins can be minimized by avoidance. However, this is 
not the case with human-derived therapeutic proteins. The selection and/or creation of
protein variants with reduced immunogenicity is therefore necessary to improve the safety and
efficacy of administered proteins. The selection of a naturally occurring hypo-immunogenic
protein isomer is an option where several related molecules with similar activities exist.
Unfortunately, this is not an option for many therapeutic proteins. Thus, there is a long-felt
need in the art for means to produce hypo-immunogenic proteins suitable for use as
therapeutics and for other applications.

SUMMARY OF THE INVENTION 

The present invention provides means to assess immune response profiles of
populations. In particular, the present invention provides means to qualitatively assess the
immune response of human populations, wherein the immune response directed against 
any protein of interest is analyzed. The present invention further provides means to rank 
proteins based on their relative immunogenicity. In addition, the present invention provides 
means to create proteins with reduced immunogenicity for use in various applications. 

The present invention was developed in order to avoid the issues arising from
immunogenicity analyses in animals other than humans. In preferred embodiments of the
present invention, means are provided to rank the immunogenicity of proteins using human
peripheral blood mononuclear cells (PBMC) as the test "subject." Because large replicates of
human samples are used, the information provided is applicable to general populations of
humans. Importantly, the data do not suffer from the specificity issues surrounding the use
of inbred mice. In preferred embodiments, the present invention provides means to rank
proteins based on their overall immunogenicity. In addition, by comparing data with pre-
existing animal data, the methods of the present invention provide information pertaining to
the relative potency of proteins. For example, during the development of the present
invention, four well-characterized industrial allergens were placed in the order determined
by the GPIT and MINT tests, and were compared with the results obtained using the
methods of the present invention, including determining the sensitization of occupationally
exposed workers.

In preferred embodiments, the methods provided by the present invention involve 
the use of dendritic cells as antigen-presenting cells, 15-mer peptides offset by 3 amino 
acids that encompass an entire protein sequence of interest, and CD4+ T-cells obtained
from the dendritic cell donors. T-cells are allowed to proliferate in a sample in the presence 
of the peptides (each peptide is tested individually) and differentiated dendritic cells. It is 
not intended that any of the methods of the present invention be conducted in any 
particular order, as far as preparation of pepsets and differentiation of dendritic cells. For 
example, in some embodiments, the pepsets are prepared before the dendritic cells are 
differentiated, while in other embodiments, the dendritic cells are differentiated before the 
pepsets are prepared, and in still other embodiments, the dendritic cells are differentiated 
and the pepsets are prepared concurrently. Thus, it is not intended that the present 
invention be limited to methods having these steps in any particular order. 
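
As a rough illustration of the pepset layout described above, the following Python sketch splits a protein sequence into 15-mer peptides whose start positions are offset by 3 residues. The function name `make_pepset`, its parameters, and the handling of the C-terminal remainder (one final peptide anchored at the end of the sequence) are assumptions made for illustration; the specification does not prescribe them.

```python
def make_pepset(sequence, length=15, offset=3):
    """Return overlapping peptides that together span the whole sequence.

    The defaults mirror the 15-mer / 3-residue offset named in the text.
    The tail handling below is an illustrative assumption.
    """
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    if len(sequence) <= length:
        # A sequence no longer than one peptide is returned as a single peptide.
        return [sequence]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, offset))
    # Assumption: add one final peptide so the C-terminus is always covered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]


if __name__ == "__main__":
    # Hypothetical 20-residue sequence, used only to show the layout.
    for peptide in make_pepset("ACDEFGHIKLMNPQRSTVWY"):
        print(peptide)
```

Other peptide lengths and offsets can be tried by changing the two keyword arguments.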

If the proliferation in response to a peptide results in a stimulation index (SI) of 1.5
to 4.5, the response is considered and tallied as being "positive." The results for each 
peptide are tabulated for a donor set, which preferably reflects the general HLA allele 
frequencies of the population, albeit with some variation. The "structure value," based on
the difference from linearity, is then determined, and this value is used to rank
the relative immunogenicity of the proteins. Thus, the present invention provides 
information useful in the modification of proteins, such that reduced response rates 
predicted to be effective in humans are achieved without the need to sensitize volunteers. 
Analyses of donor responses to peptide sets based on these new proteins that have been 
designed to be hypoimmunogenic are then conducted to calculate structure values for the 
new protein(s) and confirm their immunogenicity and exposure potentials. 
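
The tallying of "positive" responses across a donor set can be sketched as follows. This is a minimal illustration and not the assay's actual data handling: the stimulation index is assumed here to be the conventional ratio of proliferation with peptide to proliferation without peptide, the default cut-off of 2.95 is taken from the particularly preferred embodiment mentioned in this summary, and the function names and input layout are hypothetical. The structure value itself is not computed, since its calculation is not defined in this section.

```python
def stimulation_index(peptide_counts, control_counts):
    """Assumed definition: proliferation with peptide / proliferation without."""
    if control_counts <= 0:
        raise ValueError("control proliferation must be positive")
    return peptide_counts / control_counts


def percent_responders(si_by_donor, threshold=2.95):
    """Percent of donors whose SI for each peptide meets the threshold.

    si_by_donor: list of per-donor lists, one SI value per peptide.
    """
    if not si_by_donor:
        raise ValueError("at least one donor is required")
    n_peptides = len(si_by_donor[0])
    if any(len(row) != n_peptides for row in si_by_donor):
        raise ValueError("every donor needs one SI value per peptide")
    tallies = [0] * n_peptides
    for row in si_by_donor:
        for i, si in enumerate(row):
            if si >= threshold:
                tallies[i] += 1  # count this donor as a responder to peptide i
    return [100.0 * t / len(si_by_donor) for t in tallies]
```

The resulting per-peptide percentages correspond to the kind of profile plotted in Figures 2 through 5, with peptides on the x-axis and percent of responding donors on the y-axis.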

In some preferred embodiments, the invention provides an assay system (i.e., the I-MUNE®
assay) for ranking relative immunogenicity of proteins. In one embodiment, the
methods comprise measuring in vitro CD4+ T-cell proliferation in response to peptide
fragments of a protein, compiling the measured responses for the protein, determining the 
structure value of the compiled responses, and comparing the structure value of the protein 
to the structure value of a second protein, wherein the protein having the lower
structure value is ranked as being less immunogenic to a human than the protein
having the higher structure value. In alternative embodiments, the tested protein is an
enzyme. In still further embodiments, the enzyme is a protease. In an additional 
embodiment, the tested protein is selected from the group consisting of antibodies, 
cytokines, and hormones. In a further embodiment, the T-cell proliferation of each peptide
fragment and each protein is determined in side-by-side tests. In other embodiments, a
"positive" response is determined based on an SI value between 2.7 and 3.2. In 
particularly preferred embodiments, the level of proliferation results in a stimulation index of 
2.95 or greater. 
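
The comparison step can be pictured as a simple sort. The sketch below assumes the structure values have already been determined by the procedure defined elsewhere in the specification; the function name and the protein labels and numbers in the usage example are placeholders, not data from the source.

```python
def rank_by_structure_value(structure_values):
    """Order protein names from lowest to highest structure value.

    Per the text, a lower structure value is ranked as less immunogenic,
    so the first name returned is the least immunogenic of the set.
    """
    return sorted(structure_values, key=structure_values.get)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = {"protein_A": 0.40, "protein_B": 0.10, "protein_C": 0.25}
    print(rank_by_structure_value(example))
```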

The present invention also provides methods for assessing the reduced
immunogenic capacity of variant proteins in humans. In some embodiments, the methods
comprise reducing one or more prominent regions of a parent protein to a background level
to create a variant protein, determining the structure value of the variant, and comparing
the structure value of the variant with the structure value of the parent protein, wherein the
lower structure value indicates a protein with reduced immunogenicity. In some preferred
embodiments, the protein is an enzyme. In some alternative embodiments, the protein is
selected from the group consisting of proteases, cytokines, hormones, antibodies,
amylases, and other enzymes, including but not limited to subtilisins, ALCALASE®
enzyme, cellulases, lipases, oxidases, isomerases, kinases, phosphatases, lactamases,
and reductases. In further embodiments, the number of prominent regions reduced to
background level is between 1 and 10, preferably between 1 and 5. In yet another
embodiment, one or more amino acid residues are altered in the prominent region of the
parent protein to create a variant.

The present invention also provides methods for selecting the least immunogenic
protein from a group of related proteins. In one embodiment, the related proteins are
antibodies, while in an alternative embodiment they are cytokines, and in yet another
embodiment, they are hormones. In a further embodiment, the related proteins are
structural proteins. In yet another embodiment, the proteins are enzymes. In some
preferred embodiments, the enzymes are selected from the group consisting of proteases,
cellulases, lipases, amylases, oxidases, isomerases, kinases, phosphatases, lactamases,
and reductases.

The present invention further provides methods of using the relative ranking of 
related proteins to determine T-cell epitope modification suitable to reduce the 
immunogenicity of the proteins, particularly in humans. The present invention also provides 
means to categorize proteins based on both their background percent response and their
structure values. Thus, in some further embodiments, the proteins analyzed are 
categorized and/or ranked according to their background percent response and structure 
values. 

In some embodiments, the present invention provides methods for ranking the 
relative immunogenicity of a first protein and at least one additional protein, comprising the
steps of: (a) preparing a first pepset from a first protein and preparing at least one
additional pepset from each of the additional proteins, wherein each of the pepsets (b)
obtaining from a single human blood source a solution comprising dendritic cells and a
solution of naive CD4+ and/or CD8+ T-cells; (c) differentiating the dendritic cells to produce
a solution of differentiated dendritic cells; (d) combining the solution of differentiated
dendritic cells and the naive CD4+ and/or CD8+ T-cells with the first pepset; (e) combining
the solution of differentiated dendritic cells and the naive CD4+ and/or CD8+ T-cells with
each of the pepsets from the additional proteins; (f) measuring proliferation of the T-cells in
steps (d) and (e), to determine the responses to each peptide in the first and additional
pepsets; (g) compiling the responses of the T-cells in step (f) for the first protein and the
additional proteins; (h) determining the structure value of the compiled responses of step
(g) for the first protein and the additional proteins; and (i) comparing the structure value
obtained for the first protein with the structure value for the additional proteins to determine
the immunogenicity ranking of the first protein and the additional proteins. In some
preferred embodiments, the pepsets comprise peptides of about 15 amino acids in length,
while in some particularly preferred embodiments each peptide overlaps adjacent peptides
by about 3 amino acids. However, it is not intended that the peptides within the pepsets be
limited to any particular length nor overlap, as other peptide lengths and overlap amounts
find use in the present invention.

In some embodiments, the protein having the lowest structure value is ranked as
being less immunogenic than the protein having the higher structure value. In additional
embodiments, the at least two proteins are selected from the group consisting of enzymes,
hormones, cytokines, antibodies, structural proteins, and binding proteins. In still further
embodiments, a positive response against the first protein comprises a stimulation index
value between about 2.7 and about 3.2. In yet other embodiments, a positive response
against the additional proteins comprises a stimulation index value between about 2.7 and
about 3.2. In further embodiments, a positive response against the first protein comprises
a stimulation index value between about 2.7 and about 3.2 and a positive response against
the additional proteins comprises a stimulation index value between about 2.7 and about
3.2. In some embodiments, proliferation of the T-cells in step (d) results in a stimulation
index of about 2.95 or greater, while in additional embodiments, the proliferation of the T-
cells in step (e) results in a stimulation index of about 2.95 or greater. In still further
embodiments, the proliferation of the T-cells in step (d) results in a stimulation index of
about 2.95 or greater and the proliferation of the T-cells in step (e) results in a stimulation
index of about 2.95 or greater. In some particularly preferred embodiments, at least one
additional human blood source is used in step (b). In some additional particularly preferred
embodiments, the structure values obtained for each of the human blood sources and the
proteins are compared. The present invention also provides means to categorize proteins
based on both their background percent response and their structure values. Thus, in
some further embodiments, the proteins analyzed are categorized and/or ranked according
to their background percent response and structure values.

The present invention also provides methods for ranking the relative
immunogenicity of two proteins, wherein the second protein is a protein variant of the first
protein, comprising the steps of: (a) preparing a first pepset from a first protein and a 
second pepset from a second protein; (b) obtaining from a single human blood source a 
solution comprising dendritic cells and a solution of naive CD4+ and/or CD8+ T-cells; (c)
differentiating the dendritic cells to produce a solution of differentiated dendritic cells; (d)
combining the solution of differentiated dendritic cells and the naive CD4+ and/or CD8+ T-
cells with the first pepset; (e) combining the solution of differentiated dendritic cells and the 
naive CD4+ and/or CD8+ T-cells with the second pepset; (f) measuring proliferation of the 
T-cells in steps (d) and (e), to determine the responses to each peptide in the first and
second pepsets; (g) compiling the responses of the T-cells in step (f) for the first protein
and the second protein; (h) determining the structure value of the compiled responses of
step (g) for the first protein and the second protein; and (i) comparing the structure value
obtained for the first protein with the structure value for the second protein to determine the
immunogenicity ranking of the first protein and the second protein. In some embodiments,
the second protein is ranked as less immunogenic than the first protein, while in alternative
embodiments, the first protein is ranked as less immunogenic than the second protein. In 
some preferred embodiments, the pepsets comprise peptides of about 15 amino acids in 
length, while in some particularly preferred embodiments each peptide overlaps adjacent 
peptides by about 3 amino acids. However, it is not intended that the peptides within the
pepsets be limited to any particular length nor overlap, as other peptide lengths and overlap
amounts find use in the present invention. In additional embodiments, the first and second
proteins are selected from the group consisting of enzymes, hormones, cytokines,
antibodies, structural proteins, and binding proteins. In still further embodiments, a positive
response against the first protein comprises a stimulation index value between about 2.7
and about 3.2, while in other embodiments, a positive response against the second protein
comprises a stimulation index value between about 2.7 and about 3.2. In additional
embodiments, a positive response against the first protein comprises a stimulation index
value between about 2.7 and about 3.2 and a positive response against the second
protein comprises a stimulation index value between about 2.7 and about 3.2. In still
further embodiments, the proliferation of the T-cells in step (d) results in a stimulation
index of about 2.95 or greater and the proliferation of the T-cells in step (e) results in a
stimulation index of about 2.95 or greater. In some particularly preferred embodiments, at
least one additional human blood source is used in step (b). In some additional particularly 
preferred embodiments, the structure values obtained for each of the human blood sources 
and the proteins are compared. In some embodiments, the second protein comprises a 
reduction of at least one prominent region in the first protein. In further embodiments, the 
proliferation of the T-cells in step (e) is at a background level. In some particularly 
preferred embodiments, the structure values obtained for each of the human blood sources 
and the proteins are compared. The present invention also provides means to categorize 
proteins based on both their background percent response and their structure values. 
Thus, in some further embodiments, the proteins analyzed are categorized and/or ranked 
according to their background percent response and structure values. 

The present invention also provides methods for ranking the relative 
immunogenicity of a first protein and at least one variant protein, comprising the steps of: 
(a) preparing a first pepset from a first protein and pepsets from each of the variant 
proteins; (b) obtaining from a single human blood source a solution comprising dendritic 
cells and a solution of naive CD4+ and/or CD8+ T-cells; (c) differentiating the dendritic cells 
to produce a solution of differentiated dendritic cells; (d) combining the solution of 
differentiated dendritic cells and the naive CD4+ and/or CD8+ T-cells with the first pepset; 
(e) combining the solution of differentiated dendritic cells and the naive CD4+ and/or CD8+
T-cells with each pepset prepared from each of the variant proteins; (f) measuring 
proliferation of the T-cells in steps (d) and (e), to determine the responses to each peptide 
in the first and second pepsets; (g) compiling the responses of the T-cells in step (f) for the 
first protein and the variant protein(s); (h) determining the structure value of the compiled 
responses of step (g) for the first protein and the variant protein(s); and (i) comparing the 
structure value obtained for the first protein with the structure value for the variant protein(s) 
to determine the immunogenicity ranking of the first protein and the variant proteins. In 
some preferred embodiments, the pepsets comprise peptides of about 15 amino acids in 
length, while in some particularly preferred embodiments each peptide overlaps adjacent 
peptides by about 3 amino acids. However, it is not intended that the peptides within the 
pepsets be limited to any particular length nor overlap, as other peptide lengths and overlap 
amounts find use in the present invention. In some preferred embodiments, at least one of 
the variant proteins is ranked as less immunogenic than the first protein, while in other 
embodiments, the first protein is ranked as less immunogenic than at least one of the 
variant proteins. In additional embodiments, the first and the variant proteins are selected from
the group consisting of enzymes, hormones, cytokines, antibodies, structural proteins, and 
binding proteins. In further embodiments, a positive response against the first protein
comprises a stimulation index value between about 2.7 and about 3.2, while in other
embodiments, a positive response against a variant protein comprises a stimulation index 
value between about 2.7 and about 3.2. In additional embodiments, a positive response 
against the first protein comprises a stimulation index value between about 2.7 and about 
3.2 and a positive response against a variant protein comprises a stimulation index value 
between about 2.7 and about 3.2. In still further embodiments, the proliferation of the T- 
cells in step (d) results in a stimulation index of about 2.95 or greater and the proliferation
of the T-cells in step (e) results in a stimulation index of about 2.95 or greater. In some
particularly preferred embodiments, at least one additional human blood source is used in 
step (b). In some additional particularly preferred embodiments, the structure values 
obtained for each of the human blood sources and the proteins are compared. In some 
embodiments, the variant protein comprises a reduction of at least one prominent region in 
the first protein. In further embodiments, the proliferation of the T-cells in step (e) is at a 
background level. In some preferred embodiments, the proliferation of the T-cells in step 
(e) for at least one variant protein is at a background level. In some particularly preferred 
embodiments, the structure values obtained for each of the human blood sources and the 
proteins are compared. In further embodiments, at least one additional human blood 
source is used in step (b). The present invention also provides means to categorize 
proteins based on both their background percent response and their structure values. 
Thus, in some further embodiments, the proteins analyzed are categorized and/or ranked 
according to their background percent response and structure values. 

The present invention further provides methods for determining the immune
response of a test population against a test protein, comprising the steps of: (a) preparing a 
pepset from a test protein; (b) obtaining a plurality of solutions comprising human dendritic 
cells and a plurality of solutions of naive human CD4+ and/or CD8+ T-cells, wherein the 
solutions of human dendritic cells and solutions of naive human CD4+ and/or CD8+ T-cells 
are obtained from a plurality of individuals within the test population; (c) differentiating the 
dendritic cells to produce a plurality of solutions comprising differentiated dendritic cells; (d) 
combining the plurality of the solutions of differentiated dendritic cells and the solutions of 
naive CD4+ and/or CD8+ T-cells with the pepset, wherein each of the solutions of 
differentiated dendritic cells and the solutions of naive CD4+ and/or CD8+ T-cells from
one individual within the test population are combined; (e) measuring proliferation of the T-
cells in step (d), to determine the responses to each peptide in the pepset; (g) compiling the 
responses of the T-cells in step (e) for the test protein; (h) determining the structure value 
of the compiled responses of step (g) for the test protein; and (i) determining the level of 
exposure of the plurality of individuals to the test protein. In some preferred embodiments,
the pepsets comprise peptides of about 15 amino acids in length, while in some particularly
preferred embodiments each peptide overlaps adjacent peptides by about 3 amino acids. 
However, it is not intended that the peptides within the pepsets be limited to any particular 
length nor overlap, as other peptide lengths and overlap amounts find use in the present 
invention. In some embodiments, at least two test proteins are tested. In some preferred
embodiments, the level of exposure of the plurality of individuals to the test protein is 
compared. In some particularly preferred embodiments, the test protein is modified to 
produce a variant protein that exhibits a reduced immunogenic response in the test 
population. The present invention also provides means to categorize proteins based on 
both their background percent response and their structure values. Thus, in some further
embodiments, the proteins analyzed are categorized and/or ranked according to their 
background percent response and structure values. 

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 illustrates the average frequency of the HLA-DRB1 allele for 184 random 
individuals in the community donor population compared to published "Caucasian" HLA- 
DRB1 populations. 

Figure 2 illustrates the percent of responders from a population of 82 random 
individuals tested with peptides derived from Bacillus licheniformis alpha amylase. The
consecutive 15-mer peptides offset by 3 amino acids are listed on the x-axis and the
percentages of donors who responded to each peptide are shown on the y-axis.

Figure 3 illustrates the percent of responders from a population of 65 random 
individuals tested with peptides derived from Bacillus lentus subtilisin. The consecutive
15-mer peptides offset by 3 amino acids are listed on the x-axis and the percent of donors who
responded to each peptide is shown on the y-axis.

Figure 4 illustrates the percent responders from a population of 113 individuals
tested with two peptide sets from a Bacillus BPN' subtilisin Y217L. The consecutive 15-mer
peptides offset by 3 amino acids are listed on the x-axis and the percentages of donors who
responded to each peptide are shown on the y-axis.

Figure 5 illustrates the percent responders from a population of 92 individuals tested 
with peptides derived from ALCALASE® enzyme. The consecutive 15-mer peptides offset 
by 3 amino acids are listed on the x-axis and the percentages of donors who responded to 
each peptide are shown on the y-axis.

Figure 6 provides a graph showing that the calculated structure values decrease
with increasing number of responses per peptide. The structure values shown were those
determined for α-amylase (squares) and BPN' Y217L (diamonds), as responses
accumulated.

Figure 7, Panels A and B provide a comparison between GPIT (Panel A) and MINT 
(Panel B) ranking data and the structure index values for four industrial enzymes. The 
relative allergenicities of α-amylase, ALCALASE® enzyme, BPN' Y217L, and B. lentus
subtilisin as determined in guinea pig (GPIT) and mouse (MINT)-based assays are 
compared to the structure index values (y-axis). 

Figure 8 provides a graph showing a limited dataset indicating the variant peptide 
responses used to calculate the structure for the BPN' Y217L variant. Forty-eight
community donors were tested with peptides derived from the sequence of BPN' Y217L.
The consecutive 15-mer peptides offset by 3 amino acids are listed on the x-axis and the 
percentages of the donors who responded to each peptide are shown on the y-axis. The 
last two peptides represent variant sequences of peptides number 24 and 37. 

Figure 9 provides a graph showing the maximum proliferative responses of PBMC 
from 30 community donors to BPN' Y217L (open triangles, structure value = 0.53) and the
unmodified BPN' Y217L variant (closed squares, structure value = 0.40). Each donor's 
maximum response is shown on the y-axis. An SI of 2.0 was the cut-off for a "positive" 
response. The difference in proliferative responses between BPN' Y217L and the variant 
was p< 0.01. 

Figure 10 provides a graph showing the relative structure value and background
percent of responses to the 25 proteins tested as described in Example 5.

Figure 11 provides a graph showing the average percent response per peptide for
each of 11 tested proteins for the donors tested.

Figure 12 provides graphs showing the frequency of responses to B. lentus
subtilisin (n=65 community donors). Panel A shows the percent of responses to linear
peptides describing the sequence of subtilisin. The consecutive peptides are shown on the
x-axis. Percent response within the 65 donors is on the y-axis. Panel B shows the 
frequency of responses within the set. The frequency of responses to the peptides within 
the B. lentus peptide set is shown. 

Figure 13 provides a graph showing the responses of seven SPT+ (skin prick test
positive) donors to B. lentus peptides. PBMC from 7 donors verified to be sensitized to B.
lentus subtilisin by skin prick test were used in the I-MUNE® assay of the present invention
to test for their responses to B. lentus subtilisin peptides. A response to a peptide was
considered positive if an SI of 2.95 or greater was observed. The number of donors
responding to each peptide is shown on the y-axis. The consecutive B. lentus peptides are
shown on the x-axis.

Figure 14 provides graphs showing I-MUNE® assay data results for staphylokinase.
Panel A provides the percent responders per peptide (n=72). The consecutive
staphylokinase peptides are shown on the x-axis. The percent responders within the donor
set of 72 is shown on the y-axis. Panel B shows the frequency of responses per peptide.

Figure 15 provides a table showing the epitope alignment between the results
obtained using the I-MUNE® assay system of the present invention and
published epitopes for staphylokinase.

Figure 16 provides graphs showing the I-MUNE® assay results for
β2-microglobulin. Panel A shows the percent responders per peptide (n=87). The consecutive
human β2-microglobulin peptides are shown on the x-axis. The percent response within
the 87 donor set is shown on the y-axis. Panel B shows the frequency of responses per
peptide.

Figure 17 provides a table showing the IC50 binding values for epitope peptides
identified in bacterial proteases by the I-MUNE® assay system of the present invention.
Values less than 500 nM are considered to be good binders and are highlighted in bold in
the Table. Degeneracy indicates the number of HLA class II proteins that bind with an IC50
of less than 500 nM out of the 18 total alleles tested.

DESCRIPTION OF THE INVENTION 

The present invention provides means to assess immune response profiles of 
populations. In particular, the present invention provides means to qualitatively assess the 
immune response of human populations, wherein the immune response directed against 
any protein of interest is analyzed. The present invention further provides means to rank 
proteins based on their relative immunogenicity. In addition, the present invention provides 
means to create proteins with reduced immunogenicity for use in various applications. 

The present invention provides ex vivo techniques for the identification of CD4+ T- 
cell epitopes on a human population basis. Within a donor population pre-sensitized to the 
protein of interest, all recall epitopes can be defined. For a donor population defined as un- 
sensitized to the protein of interest, either primary or cross-reactive epitopes are identified. 
While the latter cannot be formally ruled out, a number of points support the conclusion that 
the epitopes found are primary epitopes. First, the epitopes found in industrial proteins are 
largely promiscuous binders with low IC50 values in an in vitro binding assay. Recall
responses are marked by lower threshold values over time rather than being narrowed to
the highest binding values (See, Hesse et al., J. Immunol., 167:1353-1361 [2001]).
Second, a subset of total recall epitopes is always found when using presumably
un-sensitized donors. This is a characteristic of primary, immunodominant epitopes (See,
Muraro et al., J. Immunol., 164:5474-5481 [2000]; Vanderlugt, Nat. Rev. Immunol., 2:85-95
[2002]; Vanderlugt, J. Immunol., 164:670-678 [2000]; and Yin et al., J. Immunol.,
26:2063-2068 [1998]). Third, β-2 microglobulin was tested as a set of 15-mer peptides off-set by 3
amino acids, representing a group of 52 peptides to which no prominent epitope responses
were found. It seems unlikely that none of these sequences would be found to be
cross-reactive sequences in any other proteins. Fourth, when an epitope cross-reactive with a
sequence found in a protein from a human pathogenic agent is found, as was the case for
one bacterial enzyme protein examined, the percent responses to the epitope peptide were
very high (30%), much higher than any responses collated in the other 10 industrial
enzymes tested as described in Example 7 (data not shown). Fifth, the I-MUNE® assay
system of the present invention is performed using CD4+ T cell enriched responder cells
and activated monocyte-derived dendritic cells as APCs. The magnitude of proliferative
responses seen is very small, consistent with a low precursor frequency of antigen-specific
CD4+ T cells. Recall proliferative responses were detected as being much more robust
than the responses detected in the presumably un-sensitized population. Finally, BLAST
searches were performed with the epitope sequences. For the Bacillus-derived proteins,
Bacillus species contain protease variants that have modifications within the epitope
sequences identified. However, it is unlikely that the donor pool would become sensitized
to these, or any of the other Bacillus serine proteases (with the notable cross-reactive
example cited above). Interestingly, there is some homology (66% homology) of the amino
acids 70-84 epitope region in BPN' Y217L to a region in a putative human-derived
ATP-dependent RNA helicase (See, Imamura et al., Nucl. Acids Res., 26:2063-2068 [1998]).
Homology to a widely expressed housekeeping gene such as this might be expected to
induce tolerance rather than provoke a cross-reactive response.

The background rate is an important consideration in analyzing population data. 
The background rate is contributed to by both accumulating positive responses at epitope 
peptides, as well as random events that reach the 2.95 SI cut-off value. The low level of 
randomly accumulating positive responses reflects the heterogeneity of the proliferation
status of CD4+ T cells in human donors (See, Asquith et al., Trends Immunol., 23:595-601
[2002]). While the background could be reduced artificially by raising the cut-off response
value, having a measurable rate of background allows for the determination of where the
frequency of responses accumulates in a non-random manner. For example, the
background response rate to HPV16 E6 was significantly higher than the rate for industrial
enzymes, likely reflecting the high prevalence of HPV16 infection in the community donor
population (Lazcano-Ponce et al., Int. J. Cancer, 91:412-420 [2001]; and Stone et al., J.
Infect. Dis., 186:1396-1402 [2002]). The same situation is likely for staphylokinase.

In spite of all the variables included in the I-MUNE® assay system, the coefficient of
variance (CV) for the frequency of epitope responses was very good (an average of 20%
for four tested peptides). This level of reproducibility compares favorably to coefficient of
variance values reported for intra-laboratory and inter-donor repeat testing of primary
ELISPOT data, an analogous ex vivo assay (Keilhoz et al., J. Immunother., 25:97-138
[2002]; and Asai et al., Clin. Diag. Lab Immunol., 7:145-154 [2000]). Generally, CV values
decline as the percent response to an epitope peptide increases. In addition, non-epitope 
peptide responses with reduced frequencies (usually less than 10% of the donor 
population) have increased CV values. For example, in Example 7, the overall background 
rate was 3.15% with a standard deviation of 1.6%, a CV of 51%. 
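
The CV quoted above is the standard deviation expressed as a percentage of the mean. The short Python sketch below reproduces the arithmetic for the figures given in the text; the function name is illustrative only.

```python
def coefficient_of_variance(mean: float, std_dev: float) -> float:
    """Return the CV as a percentage: 100 * standard deviation / mean."""
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * std_dev / mean


# Values from the text: background rate 3.15%, standard deviation 1.6%.
cv = coefficient_of_variance(3.15, 1.6)
print(f"{cv:.0f}%")  # 51%
```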

The statistical method for defining epitope peptides is different if the population 
demonstrates presensitization to the protein of interest. An increased background response
is likely due to the reduced threshold for functional activation seen in recall responses (See,
Hesse et al., supra). Reduced thresholds for functional activation result in more epitopes
being detected by the I-MUNE® assay system of the present invention. A comparison of
the I-MUNE® assay system results with data from sensitized donors showed that the
prominent epitope responses in the I-MUNE® assay data aligned with epitope responses
defined by clonal CD4+ T cell lines. By reducing the level of stringency of the statistical
method, the selection of epitope peptides within the I-MUNE® assay system corresponded
with the published epitope sequences. The designation of epitope status in datasets with 
very low background rates, such as the industrial enzyme data, was more stringent. When 
the background responses are very low, many peptides accumulate responses that meet 
the cut-off value if the reduced stringency determination is used, but the overall frequency 
of responses is very low, and will be difficult to reproduce. Typically, when responses are 
less than 10% of the total population they become difficult to reproduce due to the technical 
difficulty of testing more than 100 donors. Significant epitope responses are easily
deduced from the frequency data, where epitope responses are outliers. Epitope peptide
sequences in unsensitized donors likely reflect tight binding promiscuous epitopes capable
of inducing de-novo proliferation (Viola and Lanzavecchia, Science 273:104-106 [1996]; and
Rachmilewitz and Lanzavecchia, Trends Immunol., 23:592-595 [2002]). This was 
confirmed for epitope peptides designated in two industrial enzymes by in vitro peptide 
binding studies (See, Example 7). 

The I-MUNE® assay system of the present invention did not identify any epitopes in
human β2-microglobulin. This result highlights the difference between the I-MUNE® assay
system of the present invention and algorithm-based HLA class II binding prediction
methods. Peptide-binding algorithms, freely available via the internet and known to those in
the art, predict class II binding epitopes in this sequence. However, as exemplified by the
results presented here, binding to a class II molecule does not always indicate the
presence of a functional epitope. Binding to HLA class II is necessary, but not sufficient, to
define T cell epitopes. This is a well-known property of predictive methods, and therefore 
these methods are often supplemented with functional testing. However, the present 
invention provides a more direct means to obtain this information. 

It is important to note that the epitope determinations described herein are defined
on a population basis. While prominent epitopes often show some level of HLA specificity,
the epitope peptides are largely defined by their promiscuous HLA binding capacity.
Because of this, these epitopes are likely supertype binders and therefore represent good
candidates for modification, if a hypo-immunogenic protein is sought. However, it is
contemplated that due to the population based analysis, hypo-immunogenic proteins
created using these results as a guide are not always non-immunogenic in every discrete
instance. Nonetheless, defining T-cell epitopes on a population basis finds use in
characterization of immune responses to infectious agents (See, Novitsky et al., J. Virol.,
76:10155-10168 [2002]; and Pathan et al., J. Immunol., 167:5217-5225 [2001]). One
purpose for such studies is to design efficacious vaccines, where the inclusion of
promiscuous supertype binders is also warranted. Interestingly, when the data presented in
one of these studies (Pathan et al., supra) was subjected to analysis by the exposed-donor
method defined herein, the same set of dominant epitope responses were selected (data
not shown). 

In addition to its utility in the infectious disease setting, as well as protein analyses, 
the methods of the present invention provide means to localize the functional CD4+ T cell
epitopes in any protein of interest. When the donor population is expected to be
un-exposed to the protein of interest, the background response rate is low, and stringent
statistics can be applied to the selection of CD4+ epitope sequences. Interestingly, human
proteins have very low background responses. A high background level corresponds with
donor exposure to the protein of interest, and the epitope determination relies on less
stringent criteria. Epitope designations have been validated by comparison to results for
verified sensitized donors. As indicated above, no epitopes were found in human β-2
microglobulin, as would be expected for a ubiquitously expressed protein that imprints
tolerance on the immune system. Thus, the present I-MUNE® assay system provides a
valuable tool for predicting population-based CD4+ T-cell epitopes. The applications for
this technology include the creation of hypo-immunogenic protein variants, the selection of
epitope regions for the creation of epitope-based vaccines, and use as a tool for inclusion in the
risk assessment evaluation of all commercial proteins. 

Indeed, the present invention provides means to reduce the sensitization potential 
of CD4+ T-cells. This is particularly of use in target populations that have not been 
previously exposed to a potential commercial protein or any other protein intended for use 
by/for humans and other animals. Indeed, in addition to the creation of hypo- 
allergenic/immunogenic commercial protein variants, T-cell epitope identification is the 
basis of many vaccine strategies (Alexander et al., Immunol. Res., 18:79-2 [1998]; and
Berzofsky, Ann. N.Y. Acad. Sci., 690:256-264 [1993]). The identification of T cell epitopes
recognized by individuals who clear pathogens versus those who do not is of interest to the
design of both cancer and viral vaccines (Manici et al., J. Exp. Med., 189:871-87 [1999];
Doolan et al., J. Immunol., 165:1123-1137; and Novitsky et al., J. Virol., 76:10155-10168
[2002]). The utility of hypo-allergenic/immunogenic proteins is also clear for personal care, 
health care, and home care settings, as well as in commercial applications. Indeed, such 
hypo-allergenic/immunogenic proteins find use in innumerable settings and uses. 

For the creation of CD4+ T cell epitope-modified proteins, the first critical step is the 
localization of functional epitopes within the protein. There are a number of computer- 
based methods for predicting the localization of peptide sequences that bind to HLA class II 
molecules (Yu et al., Mol. Med., 8:137-148 [2002]; Rammensee et al., Immunogenet.,
50:213-219 [1990]; Sturniolo et al., Nat. Biotechnol., 17:555-561 [1999]; and Altuvia et al.,
J. Mol. Biol., 249:244-250 [1995]). Binding to HLA is necessary, but not sufficient, for CD4+ 
T cell activation. Optimally, in vitro and in vivo testing must be performed to confirm 
functionality. Computer based methods are improving in their ability to correctly identify 
tight HLA binders, but still suffer from a lack of prediction for binding non HLA-DR class II 
molecules, and a significant false negative rate. In addition, functional differences such as 
the induction of tolerance, and epitopes that induce differential responses by activated T 
cells cannot be assessed using computer modeling. 

Thus, the present invention provides means heretofore unavailable for the 
identification and confirmation of functionality of methods for assessing CD4+ T-cell 
epitope-modified proteins. In some embodiments, the present invention provides an in vitro
human cell-based method for the localization of immunodominant, promiscuous HLA class
II epitopes from any protein of interest. The method applies equally well to industrial 
enzymes, food allergens, and human therapeutic proteins as it does to the delineation of 
population-based epitope responses to pathogen-derived proteins, as well as any other 
protein of interest. In preferred embodiments, large donor sets are tested without
pre-selection for HLA type. Epitope determinations are made based on statistical analyses of
the response rates by the entire donor set to all the peptides derived from the sequence of
the protein, and therefore represent population-based epitopes. As indicated herein, the
methods of the present invention are capable of distinguishing proteins to which
the donor population has been exposed from proteins that the donor population has not
previously encountered or has not become sensitized to. During the development of the
present invention, both types of analyses were compared to proliferation results from
verified antigen-sensitized donors. In addition, human β2-microglobulin was tested and
confirmed as a negative control. 

As referred to herein, epitope peptides are designated by difference from the
background response rate. Epitope peptide responses are reproducible, with a median
coefficient of variance of 21% when tested on multiple random-donor sets. In addition, as
discussed in greater detail herein, the I-MUNE® assay system of the present invention
identified recall epitopes for the protein staphylokinase, and identified immunodominant
promiscuous epitopes in industrial proteases representing a subset of the total recall
epitopes. Furthermore, the I-MUNE® assay system found no epitopes in the negative
control (i.e., human β-2 microglobulin). Importantly, the present invention provides means
to identify functional CD4+ T cell epitopes in any protein without pre-selection for HLA class
II type, suggests whether a donor population is pre-exposed to a protein of interest, and
does not require sensitized donors for in vitro testing.

During the development of the present invention, the use of statistical analysis of
peptide-specific responses in a large human donor pool provided a metric that ranked four
industrial enzymes in the order determined by both mouse and guinea pig exposure
models. The ranking method also compared favorably to human sensitization rates in
occupationally exposed workers. Additional confirmation of the methods of the present
invention was also obtained, based on structure values for proteins known to cause
sensitization in humans. Comparison of these results indicated that the sensitization levels
were found to be higher than the value determined for human β2-microglobulin. In
preferred embodiments, the present invention provides comparative methods to predict the
immunogenicity of various related and unrelated proteins in humans. Thus, the information
provided by the present invention finds use in the early development of protein therapies
and other protein-based applications to select or create reduced immunogenicity variants.

Definitions 

Unless defined otherwise herein, all technical and scientific terms used herein have 
the same meaning as commonly understood by one of ordinary skill in the art to which this
invention pertains. For example, Singleton and Sainsbury, Dictionary of Microbiology and
Molecular Biology, 2d Ed., John Wiley and Sons, NY (1994); and Hale and Marham, The
Harper Collins Dictionary of Biology, Harper Perennial, NY (1991) provide those of skill in
the art with general dictionaries of many of the terms used herein. Although any
methods and materials similar or equivalent to those described herein find use in the 
practice of the present invention, the preferred methods and materials are described 
herein. Accordingly, the terms defined immediately below are more fully described by 
reference to the Specification as a whole. 

As used herein, the term "population" refers to the individuals associated with,
and/or residing in, a given area. In some embodiments, the term is used in reference to a
number of individuals that share a common characteristic (e.g., the population with a 
particular HLA type, etc.). Although the term is used in reference to human populations in 
preferred embodiments, it is not intended that the term be limited to humans, as it finds use 
in reference to other animals and organisms. In some embodiments, the term is used in 
reference to the total set of items, characteristics, individuals, etc., from which a sample is 
taken. 

As used herein, the term "population-based immune response" refers to the immune 
response profiles (i.e., characteristics) of the members of a population. 

As used herein, the term "immune response" refers to the immunological response 
mounted by an organism (e.g., a human or other animal) against an immunogen. It is 
intended that the term encompass all types of immune responses, including but not limited 
to humoral (i.e., antibody-mediated), cellular, and non-specific immune responses. In some 
embodiments, the term reflects the immunity levels of populations (i.e., the number of 
people who are "immune" to a particular antigen and/or the number of people who are "not 
immune" to a particular antigen). 

As used herein, the term "reduced immunogenicity" refers to a reduction in the 
immune response that is observed with variant (e.g., derivative) proteins, as compared to 
the original wild-type (e.g., parental or source) protein. In preferred embodiments of the
present invention, variant proteins that stimulate a less robust immune response in vitro
and/or in vivo, as compared to the source protein are provided. It is contemplated that
these proteins having reduced immunogenicity will find use in various applications,
including but not limited to bioproducts, protein therapeutics, food and feed, personal care,
detergents, and other consumer-associated products, as well as in other treatment 
regimens, diagnostics, etc. 

As used herein, the term "enhanced immunogenicity" refers to an increase in the 
immune response that is observed with variant (e.g., derivative) proteins, as compared to
the original wild-type (e.g., parental or source) protein. In preferred embodiments of the
present invention, variant proteins that stimulate a more robust immune response in vitro
and/or in vivo, as compared to the source protein are provided. It is contemplated that
these proteins having enhanced immunogenicity will find use in various applications,
including but not limited to bioproducts, protein therapeutics, food and feed additives, as
well as in other treatment regimens, diagnostics, etc. 

As used herein, "allergenic food protein" refers to any food protein that is associated 
with causing an allergic reaction in humans and other animals. A "putative allergenic food 
protein" is a food protein that may be allergenic. A "food protein with reduced allergenicity" 
is a food protein that has been modified so as to be less allergenic (i.e., "hypoallergenic")
than the original, unmodified protein. It is intended that these terms encompass naturally- 
occurring food proteins, as well as those produced synthetically and/or using recombinant 
technology. 

As used herein, "altered immunogenic response" refers to an increased or reduced
immunogenic response. Proteins and peptides exhibit an "increased immunogenic
response" when the T-cell and/or B-cell response they evoke is greater than that evoked by
a parental (e.g., precursor) protein or peptide (e.g., the protein of interest). The net result
of this higher response is an increased antibody response directed against the variant
protein or peptide. Proteins and peptides exhibit a "reduced immunogenic response" when
the T-cell and/or B-cell response they evoke is less than that evoked by a parental (e.g.,
precursor) protein or peptide. The net result of this lower response is a reduced antibody
response directed against the variant protein or peptide. In some preferred embodiments, 
the parental protein is a wild-type protein or peptide. 

As used herein, "Stimulation Index" (SI) refers to a measure of the T-cell
proliferative response to a peptide compared to a control. The SI is calculated by dividing
the average CPM (counts per minute) obtained in testing the CD4+ T-cell and dendritic cell
culture containing a peptide by the average CPM of the control culture containing dendritic
cells and CD4+ T-cells but without the peptides. This value is calculated for each donor
and for each peptide. While in some embodiments, SI values of between about 1.5 and 4.5
are used to indicate a positive response, the preferred SI value to indicate a positive
response is between 2.5 and 3.5, inclusive, preferably between 2.7 and 3.2 inclusive, and
more preferably between 2.9 and 3.1 inclusive. The most preferred embodiments
described herein use an SI value of 2.95.
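
The SI calculation described above can be sketched in Python as follows. The CPM readings are made-up illustrative numbers, the function names are hypothetical, and the 2.95 cut-off is treated as "2.95 or greater", following the wording used for Figure 13.

```python
from statistics import mean

SI_CUTOFF = 2.95  # most preferred cut-off stated in the text


def stimulation_index(peptide_cpm: list[float], control_cpm: list[float]) -> float:
    """Average CPM of the peptide culture divided by average CPM of the control."""
    control = mean(control_cpm)
    if control <= 0:
        raise ValueError("control CPM must be positive")
    return mean(peptide_cpm) / control


def is_positive(si: float, cutoff: float = SI_CUTOFF) -> bool:
    """A response counts as positive when the SI reaches the cut-off."""
    return si >= cutoff


# Hypothetical replicate wells for one donor and one peptide.
si = stimulation_index([900.0, 1100.0, 1000.0], [300.0, 350.0, 250.0])
print(round(si, 2), is_positive(si))
```

One SI would be computed in this way for each donor and each peptide in the pepset.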

As used herein, the term "dataset" refers to compiled data for a set of peptides and
a set of donors tested for their responses against each test protein (i.e., a protein of
interest).

As used herein, the term "pepset" refers to the set of peptides produced for each
test protein (i.e., protein of interest). These peptides in the pepset (or "peptide sets") are
tested with cells from each donor.
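
One way to picture a dataset is as a donors-by-peptides table of SI values. The Python sketch below, which assumes that layout and a 2.95 cut-off, compiles the percent of donors responding to each peptide; the function name and the sample values are hypothetical.

```python
def percent_responders(si_by_donor: list[list[float]], cutoff: float = 2.95) -> list[float]:
    """Percent of donors with SI >= cutoff, computed separately for each peptide.

    si_by_donor holds one row per donor and one SI value per peptide.
    """
    if not si_by_donor:
        return []
    n_donors = len(si_by_donor)
    n_peptides = len(si_by_donor[0])
    if any(len(row) != n_peptides for row in si_by_donor):
        raise ValueError("every donor must have one SI value per peptide")
    counts = [
        sum(1 for row in si_by_donor if row[j] >= cutoff)
        for j in range(n_peptides)
    ]
    return [100.0 * c / n_donors for c in counts]


# Four hypothetical donors tested against a three-peptide pepset.
dataset = [
    [1.0, 3.2, 0.8],
    [1.1, 4.0, 1.0],
    [3.0, 1.2, 0.9],
    [0.9, 2.95, 1.3],
]
print(percent_responders(dataset))  # [25.0, 75.0, 0.0]
```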

As used herein, the terms "Structure" and "Structure Value" refer to a value to rank 
the relative immunogenicity of proteins. The structure value is determined according to the 
"total variation distance to the uniform" formula below: 



$$\text{Structure} = \sum_{i=1}^{p} \left| f(i) - \frac{1}{p} \right|$$

wherein:

Σ (upper case sigma) is the sum of the absolute value of the frequency of
responses to each peptide minus the frequency of that peptide in the set; f(i) is defined
as the frequency of responses for an individual peptide; and p is the number of peptides in
the peptide set. In preferred embodiments of the present invention, a structure value is 
determined for each protein tested. Based on the structure values obtained, the test 
proteins are ranked from the lowest value to the highest value in the series of tested 
proteins. In this ranked series, the lowest value indicates the least immunogenic protein, 
while the highest value indicates the most immunogenic protein. 

The structure value is dependent on the number of donors (i.e., the number of blood
samples obtained from different individuals) tested. In general, zero responses across the 
entire dataset provide a structure value of 1.0. The same number of responses at each 
peptide returns a structure value of zero. Therefore, in preferred embodiments, a peptide 
set should be tested until there are responses across the majority of the dataset, in order 
for the data to accurately reflect responsivity to particular peptides and peptide regions. In 
particularly preferred embodiments, there is a response to every peptide in the dataset. 
However, some datasets do not exhibit responses to every peptide in the dataset due to 
various factors (e.g., insolubility issues). 
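
A minimal Python sketch of the structure value follows. It assumes that the frequency of responses for a peptide is that peptide's share of all positive responses in the dataset, and that the frequency of each peptide in the set is 1/p; under those assumptions the two boundary cases stated above (1.0 for no responses, zero for equal responses at every peptide) are reproduced. The function names and the response counts are illustrative only.

```python
def structure_value(response_counts: list[int]) -> float:
    """Sum over peptides of |f(i) - 1/p|.

    f(i) is taken here (an assumption) to be the count of positive responses
    at peptide i divided by the total number of positive responses.
    """
    p = len(response_counts)
    if p == 0:
        raise ValueError("the peptide set is empty")
    total = sum(response_counts)
    if total == 0:
        freqs = [0.0] * p  # no responses anywhere in the dataset
    else:
        freqs = [c / total for c in response_counts]
    return sum(abs(f - 1.0 / p) for f in freqs)


def rank_by_structure(datasets: dict[str, list[int]]) -> list[tuple[str, float]]:
    """Order proteins from lowest to highest structure value."""
    scored = [(name, structure_value(counts)) for name, counts in datasets.items()]
    return sorted(scored, key=lambda item: item[1])


print(structure_value([0, 0, 0, 0]))  # 1.0
print(structure_value([2, 2, 2, 2]))  # 0.0
print(rank_by_structure({"protein A": [4, 0, 0, 0], "protein B": [2, 2, 2, 2]}))
```

Because a dataset with no responses also scores 1.0, a ranking of this kind is only meaningful once responses have accumulated across most of the peptide set, as the preceding paragraph notes.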

While the above formula is the preferred formula to use for determination of the 
structure value, other equivalent formulas find use in the present invention. For example, 
the "entropy of the distribution" formula finds use in the present invention, as well as 
various other formulae known to those in the art. 

In some embodiments, the peptide sets are tested with at least as many donors as 
should produce a response per peptide given the overall rate of 3% non-specific 
responses. For example, in preferred embodiments, a peptide set of 88 peptides is tested 
with a minimum of 30 donors. Thus, in embodiments in which the pepset includes more
peptides, the number of donors is adjusted accordingly. Nonetheless, 30 donors is the
preferred minimum number. Of course, more donors may be tested using the methods of
the present invention, even when fewer peptides are present within a pepset. In some
preferred embodiments, the dataset includes at least 50 donors, in order to provide good
HLA allele representation.

As used herein, a "prominent response" refers to a peptide that produces an in vitro 
T-cell response rate in the dataset that is greater than about 2.0-fold the background 
response rate. In a further embodiment, the response is about a 2.0-fold to about a 5.0-fold 
increase above the background response rate. Also included within this term are
responses that represent about a 2.5 to 3.5-fold increase, about a 2.8 to 3.2-fold increase,
and a 2.9 to 3.1-fold increase above the background response rate. For example, during
the development of the present invention, prominent responses were noted for some of the 
peptides. 

As used herein, "prominent region" refers to an I-MUNE® assay response obtained
with a particular peptide set that is greater than about 2.0-fold the background response
rate. In one embodiment of the present invention, all of the prominent regions of a protein
are modified so that their responses in the I-MUNE® assay system of the present invention
are reduced. In further embodiments, the number of prominent regions is reduced by 1,
2, 3, 4, 5, 6, 7, 8, 9, 10 or more, and preferably between 1 and 5 prominent regions are
reduced in related proteins. In some embodiments, prominent regions also meet the
requirements for a T-cell epitope.

The term "sample" as used herein is used in its broadest sense. However, in 
preferred embodiments, the term is used in reference to a sample (e.g., an aliquot) that 
comprises a peptide (e.g., a peptide within a pepset, that comprises a sequence of a
protein of interest) that is being analyzed, identified, modified, and/or compared with other
peptides. Thus, in most cases, this term is used in reference to material that includes a 
protein or peptide that is of interest. 

As used herein, "background level" and "background response" refer to the average 
percent of responders to any given peptide in the dataset for any tested protein. This value
is determined by averaging the percent responders for all peptides in the set, as compiled
for all the tested donors. As an example, a 3% background response would indicate that 
on average there would be three positive (SI greater than 2.95) responses for any peptide 
in a dataset when tested on 100 donors. 
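
For illustration, the background calculation and its use in flagging responses above a fold threshold can be sketched as follows. The data layout (a donor-by-peptide table of SI values) and all function names are assumptions made for this sketch; the SI cutoff of 2.95 and the 2.0-fold threshold are taken from the definitions herein.

```python
def percent_responders(si_matrix, si_cutoff=2.95):
    """Percent of donors responding to each peptide.

    si_matrix[d][p] is the SI value of donor d for peptide p
    (hypothetical layout); a response is an SI greater than si_cutoff.
    """
    n_donors = len(si_matrix)
    n_peptides = len(si_matrix[0])
    return [
        100.0 * sum(1 for d in range(n_donors) if si_matrix[d][p] > si_cutoff) / n_donors
        for p in range(n_peptides)
    ]

def background_response(percents):
    """Average percent responders over all peptides in the set."""
    return sum(percents) / len(percents)

def peptides_above_fold(percents, fold=2.0):
    """Indices of peptides whose response rate exceeds fold x background."""
    bg = background_response(percents)
    return [i for i, pct in enumerate(percents) if pct > fold * bg]
```

For example, four donors tested on four peptides with response rates of 75%, 25%, 0% and 0% give a background of 25%, and only the first peptide exceeds 2.0-fold the background.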

As used herein, "antigen presenting cell" ("APC") refers to a cell of the immune
system that presents antigen on its surface, such that the antigen is recognizable by
receptors on the surface of T-cells. Antigen presenting cells include, but are not limited to,
dendritic cells, interdigitating cells, activated B-cells and macrophages.

As used herein, the terms "T lymphocyte" and "T-cell" encompass any cell within
the T lymphocyte lineage from T-cell precursors (including Thy1 positive cells which have
not rearranged the T cell receptor genes) to mature T cells (i.e., single positive for either
CD4 or CD8, surface TCR positive cells).

As used herein, the terms "B lymphocyte" and "B-cell" encompass any cell within
the B-cell lineage from B-cell precursors, such as pre-B-cells (B220+ cells which have
begun to rearrange Ig heavy chain genes), to mature B-cells and plasma cells.

As used herein, "CD4+ T-cell" and "CD4 T-cell" refer to helper T-cells, while "CD8+
T-cell" and "CD8 T-cell" refer to cytotoxic T-cells.

As used herein, "B-cell proliferation," refers to the number of B-cells produced
during the incubation of B-cells with the antigen presenting cells, with or without the 
presence of antigen. 

As used herein, "baseline B-cell proliferation" refers to the degree
of B-cell proliferation that is normally seen in an individual in response to exposure to
antigen presenting cells in the absence of peptide or protein antigen. For the purposes 
herein, the baseline B-cell proliferation level is determined on a per sample basis for each 
individual as the proliferation of B-cells in the absence of antigen. 

As used herein, "B-cell epitope," refers to a feature of a peptide or protein which is
recognized by a B-cell receptor in the immunogenic response to the peptide comprising
that antigen (i.e., the immunogen).

As used herein, "altered B-cell epitope," refers to an epitope amino acid sequence 
which differs from the precursor peptide or peptide of interest, such that the variant peptide
of interest produces different (i.e., altered) immunogenic responses in a human or another
animal. It is contemplated that an altered immunogenic response encompasses altered
immunogenicity and/or allergenicity (i.e., an either increased or decreased overall
immunogenic response). In some embodiments, the altered B-cell epitope comprises
substitution and/or deletion of an amino acid selected from those residues within the
identified epitope. In alternative embodiments, the altered B-cell epitope comprises an
addition of one or more residues within the epitope. 

"T-cell proliferation," as used herein, refers to the number of T-cells produced during
the incubation of T-cells with the antigen presenting cells, with or without the presence of 
antigen. 

"Baseline T-cell proliferation," as used herein, refers to the degree of T-cell
proliferation that is normally seen in an individual in response to exposure to antigen
presenting cells in the absence of peptide or protein antigen. For the purposes herein, the
baseline T-cell proliferation level is determined on a per sample basis for each individual as
the proliferation of T-cells in response to antigen presenting cells in the absence of antigen.

As used herein, "T-cell epitope" refers to a feature of a peptide or protein which is
recognized by a T-cell receptor in the initiation of an immunogenic response to the peptide
comprising that antigen (i.e., the immunogen). Although it is not intended that the present
invention be limited to any particular mechanism, it is generally believed that recognition of
a T-cell epitope by a T-cell is via a mechanism wherein T-cells recognize peptide fragments
of antigens which are bound to Class I or Class II MHC (i.e., HLA) molecules expressed on
antigen-presenting cells (See e.g., Moeller, Immunol. Rev., 98:187 [1987]).

As used herein, "altered T-cell epitope," refers to an epitope amino acid sequence 
which differs from the precursor peptide or peptide of interest, such that the variant peptide 
of interest produces different immunogenic responses in a human or another animal. It is 
contemplated that an altered immunogenic response encompasses altered immunogenicity
and/or allergenicity (i.e., an either increased or decreased overall immunogenic response).
In some embodiments, the altered T-cell epitope comprises substitution and/or deletion of 
an amino acid selected from those residues within the identified epitope. In alternative 
embodiments, the altered T-cell epitope comprises an addition of one or more residues 
within the epitope. 

As used herein, "protein of interest," refers to a protein (e.g., protease) which is
being analyzed, identified and/or modified. Naturally-occurring, as well as recombinant
proteins find use in the present invention. Indeed, the present invention finds use with any
protein against which it is desired to characterize and/or modulate the immunogenic
response of humans (or other animals). In some embodiments, proteins including
hormones, cytokines, antibodies, enzymes, structural proteins and binding proteins find use
in the present invention. In some embodiments, hormones, including but not limited to
insulin, erythropoietin (EPO), thymopoietin (TPO) and luteinizing hormone (LH) find use in
the present invention. In further embodiments, cytokines including but not limited to interferons
(e.g., IFN-alpha and IFN-beta), interleukins (e.g., IL-1 through IL-15), tumor necrosis
factors (e.g., TNF-alpha and TNF-beta), and GM-CSF find use in the present invention. In
yet other embodiments, antibodies (i.e., immunoglobulins), including but not limited to
human and humanized antibodies, antibody-derived fragments (e.g., single chain
antibodies) of any class, find use in the present invention. In still other embodiments,
structural proteins including but not limited to food allergens (e.g., Ber e 1 [Brazil nut
allergen] and Ara h 1 [peanut allergen]) find use in the present invention. In additional
embodiments, the proteins are industrial and/or medicinal enzymes. In some
embodiments, preferred classes of enzymes include, but are not limited to, proteases,
cellulases, lipases, esterases, amylases, phenol oxidases, oxidases, permeases,
pullulanases, isomerases, kinases, phosphatases, lactamases and reductases.

As used herein, "protein" refers to any composition comprised of amino acids and
recognized as a protein by those of skill in the art. The terms "protein," "peptide" and
"polypeptide" are used interchangeably herein. Wherein a peptide is a portion of a protein,
those of skill in the art understand the use of the term in context. The term "protein"
encompasses mature forms of proteins, as well as the pro- and prepro-forms of related
proteins. Prepro forms of proteins comprise the mature form of the protein having a
prosequence operably linked to the amino terminus of the protein, and a "pre-" or "signal"
sequence operably linked to the amino terminus of the prosequence.

As used herein, "wild-type" and "native" proteins are those found in nature. The 
terms "wild-type sequence," and "wild-type gene" are used interchangeably herein, to refer 
to a sequence that is native or naturally occurring in a host cell. In some embodiments, the
wild-type sequence refers to a sequence of interest that is the starting point of a protein
engineering project. 

As used herein, "protease" refers to naturally-occurring proteases, as well as
recombinant proteases. Proteases are carbonyl hydrolases which generally act to cleave
peptide bonds of proteins or peptides. Naturally-occurring proteases include, but are not
limited to, such examples as α-aminoacylpeptide hydrolase, peptidylamino acid hydrolase,
acylamino hydrolase, serine carboxypeptidase, metallocarboxypeptidase, thiol proteinase,
carboxylproteinase and metalloproteinase. Serine, metallo, thiol and acid proteases are
included, as well as endo and exo-proteases. Indeed, in some preferred embodiments,
serine proteases such as chymotrypsin and subtilisin find use. Both of these serine
proteases have a catalytic triad comprising aspartate, histidine and serine. In the subtilisin
proteases, the relative order of these amino acids reading from the amino to the carboxy
terminus is aspartate-histidine-serine, while in the chymotrypsin proteases, the relative
order of these amino acids reading from the amino to the carboxy terminus is
histidine-aspartate-serine. Although subtilisins are typically obtained from bacterial, fungal
or yeast sources, "subtilisin" as used herein, refers to a serine protease having the catalytic
triad of the subtilisin proteases defined above. Additionally, human subtilisins are proteins
of human origin having subtilisin catalytic activity, for example the kexin family of human
derived proteases.
Subtilisins are well known to those skilled in the art, for example, Bacillus amyloliquefaciens
subtilisin (BPN'), Bacillus lentus subtilisin, Bacillus subtilis subtilisin, and Bacillus licheniformis
subtilisin (See e.g., U.S. Patent 4,760,025 (RE 34,606), U.S. Patent 5,204,015, U.S. Patent
5,185,258, EP 0 328 299, and WO89/06279).

As used herein, functionally similar proteins are considered to be "related proteins."
In some embodiments, these proteins are derived from a different genus and/or species
(e.g., B. subtilis subtilisin and B. lentus subtilisin), including differences between classes of
organisms (e.g., a bacterial subtilisin and a fungal subtilisin). In additional embodiments,
related proteins are provided from the same species. Indeed, it is not intended that the
present invention be limited to related proteins from any source(s).

As used herein, the term "derivative" refers to a protein (e.g., a protease) which is
derived from a precursor protein (e.g., the native protease) by addition of one or more
amino acids to either or both the C- and N-terminal end(s), substitution of one or more
amino acids at one or a number of different sites in the amino acid sequence, and/or
deletion of one or more amino acids at either or both ends of the protein or at one or more
sites in the amino acid sequence, and/or insertion of one or more amino acids at one or
more sites in the amino acid sequence. The preparation of a protease derivative is
preferably achieved by modifying a DNA sequence which encodes for the native protein,
transformation of that DNA sequence into a suitable host, and expression of the modified
DNA sequence to form the derivative protease.

One type of related (and derivative) protein is a "variant protein." In preferred
embodiments, variant proteins differ from a parent protein and one another by a small
number of amino acid residues. The number of differing amino acid residues may be one
or more, preferably 1, 2, 3, 4, 5, 10, 15, 20, 30, 40, 50, or more amino acid residues. In
one preferred embodiment, the number of different amino acids between variants is
between 1 and 10. In particularly preferred embodiments, related proteins and particularly
variant proteins comprise at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%,
98%, or 99% amino acid sequence identity. Additionally, a related protein or a variant
protein as used herein, refers to a protein that differs from another related protein or a
parent protein in the number of prominent regions. For example, in some embodiments,
variant proteins have 1, 2, 3, 4, 5, or 10 corresponding prominent regions which differ from
the parent protein. In one embodiment, the prominent corresponding region of a variant
produces only a background level of immunogenic response.

As used herein, "corresponding to," refers to a residue at the enumerated position in
a protein or peptide, or a residue that is analogous, homologous, or equivalent to an
enumerated residue in another protein or peptide. 

As used herein, "corresponding region" generally refers to an analogous position 
within related proteins or a parent protein. 

As used herein, the term "analogous sequence" refers to a sequence within a
protein that provides similar function, tertiary structure, and/or conserved residues as the
protein of interest. In particularly preferred embodiments, the analogous sequence involves
sequence(s) at or near an epitope. For example, in epitope regions that contain an alpha
helix or a beta sheet structure, the replacement amino acids in the analogous sequence
preferably maintain the same specific structure.

As used herein, "homologous protein" refers to a protein (e.g., protease) that has a
catalytic action, structure, and/or antigenic or immunogenic response similar to that of the
protein (e.g., protease) of interest. It is not intended that a homolog and a protein (e.g., protease)
of interest be necessarily related evolutionarily. Thus, it is intended that the term 
encompass the same functional protein obtained from different species. In some preferred 
embodiments, it is desirable to identify a homolog that has a tertiary and/or primary 
structure similar to the protein of interest, as replacement for the epitope in the protein of 
interest with an analogous segment from the homolog will reduce the disruptiveness of the 
change. Thus, in most cases, closely homologous proteins provide the most desirable 
sources of epitope substitutions. Alternatively, it is advantageous to look to human analogs 
for a given protein. 

As used herein, "homologous genes" refers to at least a pair of genes from different, 
but usually related species, which correspond to each other and which are identical or very 
similar to each other. The term encompasses genes that are separated by speciation (i.e., 
the development of new species) (e.g., orthologous genes), as well as genes that have 
been separated by genetic duplication (e.g., paralogous genes). 

As used herein, "ortholog" and "orthologous genes" refer to genes in different 
species that have evolved from a common ancestral gene (i.e., a homologous gene) by
speciation. Typically, orthologs retain the same function during the course of evolution.
Identification of orthologs finds use in the reliable prediction of gene function in newly 
sequenced genomes. 

As used herein, "paralog" and "paralogous genes" refer to genes that are related by 
duplication within a genome. While orthologs retain the same function through the course 
of evolution, paralogs evolve new functions, even though some functions are often related 
to the original one. Examples of paralogous genes include, but are not limited to genes 
encoding trypsin, chymotrypsin, elastase, and thrombin, which are all serine proteinases 
and occur together within the same species. 

The degree of homology between sequences may be determined using any suitable 
method known in the art (See e.g., Smith and Waterman, Adv. Appl. Math., 2:482 [1981]; 
Needleman and Wunsch, J. Mol. Biol., 48:443 [1970]; Pearson and Lipman, Proc. Natl. 
Acad. Sci. USA 85:2444 [1988]; programs such as GAP, BESTFIT, FASTA, and TFASTA
in the Wisconsin Genetics Software Package (Genetics Computer Group, Madison, WI);
and Devereux et al., Nucl. Acid Res., 12:387-395 [1984]).

For example, PILEUP is a useful program to determine sequence homology levels.
PILEUP creates a multiple sequence alignment from a group of related sequences using
progressive, pairwise alignments. It can also plot a tree showing the clustering
relationships used to create the alignment. PILEUP uses a simplification of the progressive
alignment method of Feng and Doolittle (Feng and Doolittle, J. Mol. Evol., 35:351-360
[1987]). The method is similar to that described by Higgins and Sharp (Higgins and Sharp,
CABIOS 5:151-153 [1989]). Useful PILEUP parameters include a default gap weight of
3.00, a default gap length weight of 0.10, and weighted end gaps. Another example of a
useful algorithm is the BLAST algorithm, described by Altschul et al. (Altschul et al., J. Mol.
Biol., 215:403-410 [1990]; and Karlin et al., Proc. Natl. Acad. Sci. USA 90:5873-5877
[1993]). One particularly useful BLAST program is the WU-BLAST-2 program (See,
Altschul et al., Meth. Enzymol., 266:460-480 [1996]). The parameters "W," "T," and "X"
determine the sensitivity and speed of the alignment. The BLAST program uses as
defaults a wordlength (W) of 11, the BLOSUM62 scoring matrix (See, Henikoff and
Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 [1989]), alignments (B) of 50, expectation (E)
of 10, M=5, N=-4, and a comparison of both strands.

As used herein, "percent (%) nucleic acid sequence identity" is defined as the
percentage of nucleotide residues in a candidate sequence that are identical with the
nucleotide residues of the sequence.
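
As a minimal sketch of this definition, the calculation below takes two sequences that have already been aligned by any suitable method. The gap character, the equal-length requirement, and the function name are assumptions made for illustration; the denominator is the number of residues in the candidate sequence, following the wording above.

```python
def percent_identity(candidate, reference, gap="-"):
    """Percent of candidate residues identical to the aligned reference.

    Both strings must be pre-aligned to the same length. Positions where
    the candidate has a gap are skipped; a candidate residue opposite a
    reference gap counts as non-identical.
    """
    if len(candidate) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(candidate.upper(), reference.upper()):
        if a == gap:
            continue  # no candidate residue at this column
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("candidate contains no residues")
    return 100.0 * identical / compared
```

The same function applies unchanged to aligned amino acid sequences, for example when checking the percent identity thresholds recited for variant proteins.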

As used herein, the term "hybridization" refers to the process by which a strand of 
nucleic acid joins with a complementary strand through base pairing, as known in the art. 

As used herein, "maximum stringency" refers to the level of hybridization that
typically occurs at about Tm-5°C (5°C below the Tm of the probe); "high stringency" at
about 5°C to 10°C below Tm; "intermediate stringency" at about 10°C to 20°C below Tm;
and "low stringency" at about 20°C to 25°C below Tm. As will be understood by those of
skill in the art, a maximum stringency hybridization can be used to identify or detect
identical polynucleotide sequences while an intermediate or low stringency hybridization
can be used to identify or detect polynucleotide sequence homologs.
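
The stringency ranges above can be expressed as a simple lookup, sketched below. The handling of the shared boundary values (e.g., assigning exactly 10°C below Tm to "high") and the treatment of temperatures outside the stated ranges are assumptions of this sketch, since the definitions are stated only approximately.

```python
def stringency_level(tm_c, hybridization_temp_c):
    """Classify a hybridization temperature relative to the probe Tm.

    Returns "maximum", "high", "intermediate", "low", or None when the
    temperature falls outside the ranges given in the definition.
    """
    delta = tm_c - hybridization_temp_c  # degrees C below Tm
    if delta < 0:
        return None  # above Tm; not covered by the definition
    if delta <= 5:
        return "maximum"
    if delta <= 10:
        return "high"
    if delta <= 20:
        return "intermediate"
    if delta <= 25:
        return "low"
    return None  # more than 25 degrees C below Tm; not covered
```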

In some embodiments, "equivalent residues" are defined by determining homology
at the level of tertiary structure for a precursor protein (i.e., protein of interest) whose
tertiary structure has been determined by x-ray crystallography. Equivalent residues are
defined as those for which the atomic coordinates of two or more of the main chain atoms
of a particular amino acid residue of the precursor protein and another protein are within
0.13 nm and preferably 0.1 nm after alignment. Alignment is achieved after the best model
has been oriented and positioned to give the maximum overlap of atomic coordinates of
non-hydrogen protein atoms of the protein. In most embodiments, the best model is the
crystallographic model giving the lowest R factor for experimental diffraction data at the
highest resolution available.
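
The distance criterion for equivalent residues can be sketched as follows, assuming the two structures have already been superimposed and that coordinates are expressed in nanometers. The choice of N, CA, C and O as the main chain atoms, the dictionary layout, and the function name are assumptions of this sketch.

```python
import math

# Assumed set of main chain atoms compared like-for-like (N with N, etc.).
MAIN_CHAIN_ATOMS = ("N", "CA", "C", "O")

def is_equivalent_residue(res_a, res_b, cutoff_nm=0.13, min_atoms=2):
    """Return True if at least min_atoms main chain atoms lie within cutoff.

    res_a and res_b map atom names to (x, y, z) coordinates in nm, taken
    from structures that were superimposed beforehand.
    """
    close = 0
    for atom in MAIN_CHAIN_ATOMS:
        if atom in res_a and atom in res_b:
            if math.dist(res_a[atom], res_b[atom]) <= cutoff_nm:
                close += 1
    return close >= min_atoms
```

Passing cutoff_nm=0.1 applies the preferred, tighter criterion.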

In some embodiments, modification is preferably made to the "precursor DNA 
sequence" which encodes the amino acid sequence of the precursor enzyme, but in 
alternative embodiments, it is made by the manipulation of the precursor protein. In the 
case of residues which are not conserved, the replacement of one or more amino acids is 
limited to substitutions which produce a variant which has an amino acid sequence that 
does not correspond to one found in nature. In the case of conserved residues, such 
replacements should not result in a naturally-occurring sequence. Derivatives provided by 
the present invention further include chemical modification(s) that change the 
characteristics of the protease. 

In some preferred embodiments, the protein gene is ligated into an appropriate 
expression plasmid. The cloned protein gene is then used to transform or transfect a host 
cell in order to express the protein gene. This plasmid may replicate in hosts in the sense 
that it contains the well-known elements necessary for plasmid replication or the plasmid 
may be designed to integrate into the host chromosome. The necessary elements are 
provided for efficient gene expression (e.g., a promoter operably linked to the gene of 
interest). In some embodiments, these necessary elements are supplied as the gene's own
homologous promoter, if it is recognized (i.e., transcribed) by the host, and a transcription
terminator (a polyadenylation region for eukaryotic host cells) which is exogenous or is
supplied by the endogenous terminator region of the protein gene. In some embodiments,
a selection gene such as an antibiotic resistance gene that enables continuous cultural 
maintenance of plasmid-infected host cells by growth in antimicrobial-containing media is 
also included. 

In embodiments involving proteases, variant protease activity is determined and 
compared with the protease of interest by examining the interaction of the protease with 
various commercial substrates, including, but not limited to casein, keratin, elastin, and 
collagen. Indeed, it is contemplated that protease activity will be determined by any 
suitable method known in the art. Exemplary assays to determine protease activity include, 
but are not limited to, succinyl-Ala-Ala-Pro-Phe-para nitroanilide (SAAPFpNA) (citation) 
assay; and 2,4,6-trinitrobenzene sulfonate sodium salt (TNBS) assay. In the SAAPFpNA 
assay, proteases cleave the bond between the peptide and p-nitroaniline to give a visible 
yellow color absorbing at 405 nm. In the TNBS color reaction method, the assay measures
the enzymatic hydrolysis of the substrate into polypeptides containing free amino groups.
These amino groups react with TNBS to form a yellow colored complex. Thus, the more
deeply colored the reaction, the more activity is measured. The yellow color can be
determined by various analyzers or spectrophotometers known in the art.

Other characteristics of the variant proteases can be determined by methods known
to those skilled in the art. Exemplary characteristics include, but are not limited to, thermal
stability, alkaline stability, and stability of the particular protease in various substrate or
buffer solutions or product formulations.

When combined with the enzyme stability assay procedures disclosed herein,
mutants obtained by random mutagenesis can be identified which demonstrate either
increased or decreased alkaline or thermal stability while maintaining enzymatic activity.

Alkaline stability can be measured either by known procedures or by the methods 
described herein. A substantial change in alkaline stability is evidenced by at least about a 
5% or greater increase or decrease (in most embodiments, it is preferably an increase) in
the half-life of the enzymatic activity of a mutant when compared to the precursor protein.

Thermal stability can be measured either by known procedures or by the methods 
described herein. A substantial change in thermal stability is evidenced by at least about a 
5% or greater increase or decrease (in most embodiments, it is preferably an increase) in 
the half-life of the catalytic activity of a mutant when exposed to a relatively high
temperature and neutral pH as compared to the precursor protein.
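
The 5% half-life criterion used for both alkaline and thermal stability can be sketched as follows. The sketch assumes first-order loss of activity when estimating a half-life from two activity readings; that kinetic model and the function names are assumptions for illustration and are not taken from the assay procedures themselves.

```python
import math

def half_life(initial_activity, remaining_activity, elapsed):
    """Half-life from two activity readings, assuming first-order decay."""
    if not 0 < remaining_activity < initial_activity:
        raise ValueError("remaining activity must be between 0 and initial")
    return elapsed * math.log(2) / math.log(initial_activity / remaining_activity)

def percent_change_in_half_life(mutant_half_life, precursor_half_life):
    """Signed percent change of the mutant half-life versus the precursor."""
    return 100.0 * (mutant_half_life - precursor_half_life) / precursor_half_life

def is_substantial_change(mutant_half_life, precursor_half_life, threshold_pct=5.0):
    """True for an increase or decrease of at least threshold_pct percent."""
    change = percent_change_in_half_life(mutant_half_life, precursor_half_life)
    return abs(change) >= threshold_pct
```

The sign of percent_change_in_half_life distinguishes the preferred case (an increase) from a decrease.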

Many of the protein variants of the present invention are useful in formulating
various compositions for numerous applications, ranging from personal care to industrial
production. For example, a number of known compounds are suitable surfactants useful in
detergent compositions comprising the protein mutants of the present invention. These
include nonionic, anionic, cationic or zwitterionic detergents (See e.g., US Patent
No. 4,404,128, US Patent No. 4,261,868, and US Patent No. 5,204,015). Thus, it is
contemplated that proteins characterized and modified as described herein will find use in
various detergent applications. Those in the art are familiar with the different formulations
which find use as cleaning compositions. In addition to typical cleaning compositions, it is
readily understood that the protein variants of the present invention find use for any purpose
for which native or wild-type proteins are used. Thus, these variants can be used, for example,
in bar or liquid soap applications, dishcare formulations, surface cleaning applications,
contact lens cleaning solutions and/or products, peptide hydrolysis, waste treatment, textile
applications, as fusion-cleavage enzymes in protein production, etc. For example, the
variants of the present invention may comprise, in addition to decreased allergenicity,
enhanced performance in a detergent composition (as compared to the precursor). Indeed,
it is not intended that the variants of the present invention be limited to any particular use.
As used herein, "enhanced performance in a detergent" is defined as increasing cleaning of
certain enzyme sensitive stains (e.g., grass or blood), as determined by usual evaluation
after a standard wash cycle.

In some embodiments, proteins, particularly enzymes, provided by the means of the
present invention can be formulated into known powdered and liquid detergents having a
pH between 6.5 and 12.0 at levels of about 0.01% to about 5% (preferably 0.1% to 0.5%) by
weight. In some embodiments, these detergent cleaning compositions further include other 
enzymes such as proteases, amylases, cellulases, lipases or endoglycosidases, as well as 
builders and stabilizers. 

The addition of proteins to conventional cleaning compositions does not create any 
special use limitations. In other words, any temperature and pH suitable for the detergent 
are also suitable for the present compositions, as long as the pH is within the above range, 
and the temperature is below the described protein's denaturing temperature. In addition, 
proteins of the invention find use in cleaning compositions without detergents, again either 
alone or in combination with builders and stabilizers. 

In one embodiment, the present invention provides compositions for the treatment 
of textiles that includes variant proteins of the present invention. The composition can be 
used to treat for example silk or wool (See e.g., RE 216,034; EP 134,267; US 4,533,359; 
and EP 344,259). In some embodiments, these variants are screened for proteolytic 
activity according to methods well known in the art. 

As indicated above, in preferred embodiments, the proteins of the present invention 
exhibit modified immunogenic responses (e.g., antigenicity and/or immunogenicity) when 
compared to the native proteins encoded by their precursor DNAs. In some preferred 
embodiments, the proteins (e.g., proteases) exhibit reduced allergenicity. Those of skill in 
the art readily recognize that the uses of the proteases of this invention will be determined,
in large part, by the immunological properties of the proteins. For example, proteases that
exhibit reduced immunogenic responses can be used in cleaning compositions. An
effective amount of one or more protease variants described herein finds use in
compositions useful for cleaning a variety of surfaces in need of proteinaceous stain
removal. Such cleaning compositions include detergent compositions for cleaning hard 
surfaces, detergent compositions for cleaning fabrics, dishwashing compositions, oral 
cleaning compositions, and denture cleaning compositions. 

An effective amount of one or more related and/or variant proteins with reduced
allergenicity/immunogenicity, ranked according to the methods of the present invention, finds
use in various compositions that are applied to keratinous materials such as nails and hair,
including but not limited to those useful as hair spray compositions, hair shampoo and/or
conditioning compositions, compositions applied for the purpose of hair growth regulation,
and compositions applied to the hair and scalp for the purpose of treating seborrhea,
dermatitis, and/or dandruff.

In additional embodiments, effective amount(s) of one or more protease variant(s)
described herein find use in compositions suitable for topical application to the
skin or hair. These compositions can be in the form of creams, lotions, gels, and the like,
and may be formulated as aqueous compositions or may be formulated as emulsions of
one or more oil phases in an aqueous continuous phase.

In addition, the related and/or variant proteins with reduced
allergenicity/immunogenicity find use in other applications, including pharmaceutical
applications, drug delivery applications, and other health care applications.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides means to assess immune response profiles of
populations. In particular, the present invention provides means to qualitatively assess the
immune response of human populations, wherein the immune response directed against
any protein of interest is analyzed. The present invention further provides means to rank
proteins based on their relative immunogenicity. In addition, the present invention provides
means to create proteins with reduced immunogenicity for use in various applications.

The present invention provides methods to assess the overall immunogenic
potential of any protein by an analysis of the response rate of individual donors to a set of
peptides describing the protein of interest. These methods find use in selecting the least
immunogenic isomer of related proteins. In addition, these methods find use in guiding the
development of variant proteins with reduced immunogenicity.

In some preferred embodiments, population-based immune response profiles find 
use in these methods of developing proteins that have reduced immunogenicity. In 
addition, the present invention provides means to determine whether or not a particular 
population has been exposed to a protein of interest, as well as the level of the immune
responses among the individuals in the population. This determination provides
information useful in the development of proteins with altered immunogenicity 
characteristics that are desired in applications such as bioproducts, food and feed, protein 
therapeutics, personal care, healthcare products, detergents, and other consumer- 
associated goods. 

The present invention provides novel means to study the immune responses of
populations. As indicated herein, potency determinations for applications involving proteins
for administration to humans currently utilize non-human animal models. In addition, T-cell
epitope determinations based on algorithms do not provide the needed information that is
provided by the application of the present invention. Indeed, the present invention provides
means to assess the immune response profiles of individuals, as well as populations, which
provides important information for the rational design and development of
protein-containing products.

By analyzing the background response and the structure value of proteins, the 
immunological "history" of any protein of interest can be determined on a population basis. 
A high background response indicates population pre-exposure (i.e., more than
approximately 4% of the population exhibits an immune response to the protein tested). A
high structure value indicates a potential immunogen for proteins with low background 
values, and recent, frequent, and "high quality" immune responses when the protein has a 
high background. In some embodiments, "high quality" immune responses are observed, 
due to high levels of immunogen, a robust immune response against the immunogen, 
and/or a response potentiated by a strong adjuvant. 

In some embodiments, low structure values with high backgrounds represent fading
immune memory responses, infrequent responses in the population, tolerance induction by
exogenous antigen, and/or responses to proteins that are highly diverse (i.e., which may
also be a product of a "fading" memory response). It is contemplated that common,
non-allergenic food proteins are represented in this type of response profile. In addition,
proteins with low structure values and low backgrounds represent comparatively
non-immunogenic proteins with no memory response in the population and/or proteins that the
human population is tolerized against. In some preferred embodiments, proteins with low
background levels of exposure are modified so as to be made "hypoallergenic" (i.e., they do
not induce an immune response, or induce a lower response, upon exposure to a human or
other animal).

To establish a background value for proteins not encountered by the general donor
population, the I-MUNE® assay was performed on 11 industrial enzymes including
proteases, amylases, laccases, and chitinases (See, Mathies, Tenside Surf. Det., 34:450-
454 [1997]). One of the proteases was tested twice using peptides produced in two
different formats (PepSet versus purified peptides from Mimotopes). The number of donors
tested per peptide set varied from 19 to 113. The number of peptides in each peptide set
varied from 80 to 188. A response was tabulated when the stimulation index (S.I. or SI) for
an individual peptide was 2.95 or greater. The percent of donors in the tested donor set
responding to each peptide was calculated. The average percent response per peptide for
each tested protein was calculated, and is shown graphed versus the number of donors
tested (See, Figure 11). The correlation coefficient was R² = 0.86. The slope of the
correlation reveals the average accumulation rate of responses as 3.01%. Therefore, for
any given donor tested with peptides derived from industrial proteins, an average of three
peptides out of 100 will return a positive (SI ≥ 2.95) response. This average response rate
includes both epitope peptides (see below) and the non-epitope peptides.

Background responses were also calculated by averaging the percent response per 
peptide in the completed dataset. Averaging the background responses for the 12 tests, 
the value is 3.15 +/- 0.45 (average +/- standard error) which is consistent with the value 
determined by the slope of the correlation trendline. 

During the development of the present invention, a group of proteins was selected
based on their presumed exposure in the general human population. These proteins
included the human papilloma virus (HPV) strain 16 and strain 18 E6 proteins, Brazil nut
allergen Ber e 1, and staphylokinase. HPV 16 and 18 are the most prevalent forms of
tumorigenic HPV viruses. The level of exposure to these viruses has been estimated to be
5 percent or higher for cross-sectional analyses of young women (Lazcano-Ponce et al.,
Intl. J. Cancer 91:412-420 [2001]; Stone et al., J. Infect. Dis., 186:1396-1402 [2002];
Goldsborough et al., Mol. Cell Probes 6:451-457 [1992]; and Lorincz et al., Obstet.
Gynecol., 79:328-337 [1992]). This rate varies with locality and age. Brazil nut allergy
occurs in <1% of the population, but exposure to Brazil nuts in food is widespread (Sicherer
and Sampson, Curr. Opin. Pediatr., 12:567-573 [2000]). In addition, the rate of
staphylokinase-specific T-cell responses in human peripheral blood cell cultures increases
with age, with 30% of young donors responding and greater than 70% of donors over age
40 responding (Warmerdam et al., J. Immunol., 168:155-161 [2002]). Peptide sets to these
four proteins were tested with samples from local community blood banks. The
background responses to all four of these proteins were higher than the average responses
found in the 11 industrial enzymes. This is shown as both a higher overall percent
background response, and as a higher frequency of responses per peptide as compared to
the expected values based on data from the 11 industrial enzymes from Figure 11. The
background responses to HPV 16 E6 and staphylokinase were significantly higher. This
result is consistent with the presumed higher exposure rate to these proteins in the donor
pool. The background responses to HPV 18 E6 and Ber e 1 were higher than the industrial
protein average, but were not significantly different. The increase in background values as
compared to the industrial protein values is due to the contribution of CD4+ memory
responses in the donor population that increase the amplitude, number and complexity of
the overall response to a given protein (Kuhns et al., Proc. Natl. Acad. Sci. USA 97:12711-
12716 [2000]; Muraro et al., J. Immunol., 164:5474-5481 [2000]; and Vanderlugt and Miller,
Nat. Rev. Immunol., 2:85-95 [2002]). Therefore, a higher background rate represents a
higher level of sensitization to the tested protein. However, it is not intended that the
present invention be limited to any particular mechanism regarding the overall responses
against these proteins. For the proteins described herein, it can be concluded that there is
significant exposure of our donor population to HPV 16 E6 and staphylokinase, and less
exposure to HPV 18 E6 and Ber e 1. The background responses to Ber e 1 and HPV 18
E6 are suggestive of exposure to the proteins, but not at the levels of HPV 16 E6 or
staphylokinase.

In addition to these proteins, peptide sets describing human proteins were also
tested during the development of the present invention. These proteins included
interferon-β (IFN-β), a cytokine widely expressed during immune responses,
thrombopoietin (TPO), a cytokine whose expression is restricted to the bone marrow, and a
soluble recombinant cytokine receptor molecule (tumor necrosis factor receptor-1; TNF-R1).
Background responses to all of these proteins were similar to the industrial
enzyme background data, suggesting that the donors were responding to the peptides in
these sets as if they were unexposed, or "naive" to these proteins. These data are
consistent with the ignorance mechanism of peripheral tolerance to these particular
proteins.

In additional embodiments, assessment of the T-cell and/or B-cell epitopes
associated with the test proteins is made. In further embodiments, this assessment is
utilized in developing rational changes in such epitopes to reduce the
immunogenicity/allergenicity of the test proteins (i.e., to produce variant proteins with
reduced immunogenicity). These variant proteins then find use in various applications,
including but not limited to bioproducts, protein therapeutics, food and feed, personal care,
detergents, and other consumer-associated products, as well as in other treatment
regimens, diagnostics, etc.

In preferred embodiments, the method uses dendritic cells as antigen-presenting
cells, 15-mer peptides offset by 3 amino acids that encompass the entire sequence of the
protein, and CD4+ T cells from the dendritic cell donors. A "positive" response is tallied if
the average CPM of tritiated thymidine incorporation for a particular peptide is greater than
or equal to 2.95 times the background CPM. The results for each peptide are tabulated for
a large donor set that should reflect general HLA allele frequencies (with some variations).
A statistical calculation based on the determination of "difference from linearity" is
performed, and this structure value is used to rank the relative immunogenicity of these
proteins. As indicated herein, the ranking results obtained using the methods of the
present invention closely reflect immunogenicity determinations (i.e., by the MID assay of
Sarlo, [1997], supra) and allergenicity of these proteins as respiratory allergens when
determined in occupationally exposed workers (See, Sarlo, supra), or in the GPIT or MINT
assay systems (See, Robinson, [1998], supra).

During the development of the present invention, structure values for a set of
proteins including three known immunogens were found to be comparatively high,
indicating that these proteins might be capable of inducing immune responses in a
significant number of exposed people. Conversely, the structure value for a mouse VH 36-
60 gene family member was low, commensurate with its predicted immunogenicity (See,
Olsson, J. Theor. Biol., 151:111-122 [1991]). Finally, the structure value determined for
β2-microglobulin was low, as would be expected given that this molecule is presumed to be
subject to both peripheral and central tolerance mechanisms (See, Guery et al., J.
Immunol., 154:545-554 [1995]).

In additional experiments, as described herein, 25 diverse proteins were tested.
These data provide a framework for validating the present invention; it is not intended that
the present invention be limited to these 25 proteins. Indeed, the present invention finds
use in the analysis of any suitable protein of interest in any suitable population of interest.
As with the initial experiments described above, the proteins were tested in the I-MUNE®
assay system described herein, and structure values were determined. For these 25
proteins, the structure values and background responses delineated four subsets of
proteins with varying attributes of interest among the population tested. The ranking
method described herein was validated on those proteins with low background responses.
Furthermore, all of the proteins tested were compared with those having high background
responses. In addition to ranking the potential immunogenicity of the proteins, these
embodiments provide information regarding the type of immune response the general
population has mounted against the tested proteins.

The comparative immunogenicity of proteins tested in the I-MUNE® assay system
of the present invention assumes that the proteins would be compared in vivo at the same dose,
in the same formulation, in a matched set of donors, and over the same dose course. This
analysis also excludes from consideration any processing and/or presentation differences
in the proteins, as well as general physical and structural properties (i.e., stability and activity).

The present invention provides methods that facilitate the localization of T cell
epitopes in any protein of interest. For example, in some preferred embodiments, CD4+ T
cell epitopes are determined in the absence of individuals sensitized to the test protein.
Thus, modifications of the peptide epitopes that yield reduced response rates predicted to
be effective in humans are achievable without the need to sensitize volunteers. In some
embodiments, an analysis of donor responses to the modified peptide variants is used to
calculate structure values for the new protein. For example, as shown in Figure 9, a
protease variant constructed to have a reduced structure value induced significantly less
proliferation in vitro when compared to the parent protein.

The present invention provides distinct advantages in determining the
immunogenicity of proteins. Protein variants designed to be less immunogenic by virtue of
provoking fewer responses in vitro with large replicates of human donors cannot be
rationally tested in guinea pigs or mice. Transgenic
mice are limited in their utility, due to the fact that they typically do not express more than
one HLA allele, and even then it is often not expressed in a correct context.

Although the ranking of proteins does not imply any fold potency differences, 
potency differences in guinea pig and mouse models are notoriously inaccurate, 
susceptible to inter-laboratory as well as inter-experiment variability, and are strain 
dependent in mice. Indeed, potency determination in animals, particularly guinea pigs, is a
subjective science, at best. Currently, there is no reliable method to determine potency.
However, the present invention provides a means to make potency determinations by
extrapolating data based on the alignment of the data determined using the methods of the
present invention with data obtained from animal experiments. Despite the fact that these
potency values are subject to the same inherent inaccuracies as the animal data used to 
standardize the structure value results, the present invention provides much-improved 
means to assess immunogenicity, particularly in humans, and determine how best to 
reduce the immunogenicity of proteins. 

Furthermore, the present invention provides means to determine the relative 
immunogenicity of proteins in human subjects (or other animals) without the necessity of 
exposing the subjects to the protein of interest. Thus, there is no risk of sensitizing 
individuals to potentially allergenic/immunogenic substances in order to make the 
determinations. Importantly, the present invention provides means to rank the 
immunogenicity of proteins relative to each other, as well as assess the immune response 
profiles of populations. Indeed, the present invention provides the means to select and/or 
develop reduced immunogenicity proteins and direct the rational modification of proteins, to 
create and test hypo-immunogenic variants that are suitable for use in humans and other
animals, particularly in humans.

EXPERIMENTAL 

The following examples serve to illustrate certain preferred embodiments and
aspects of the present invention and are not to be construed as limiting the scope thereof.

In the experimental disclosure which follows, the following abbreviations apply: eq
(equivalents); M (Molar); µM (micromolar); N (Normal); mol (moles); mmol (millimoles);
µmol (micromoles); nmol (nanomoles); g (grams); mg (milligrams); kg (kilograms); µg
(micrograms); L (liters); ml (milliliters); µl (microliters); cm (centimeters); mm (millimeters);
µm (micrometers); nm (nanometers); °C (degrees Centigrade); h (hours); min (minutes);
sec (seconds); msec (milliseconds); xg (times gravity); Ci (Curies); OD (optical density);
DPBS (Dulbecco's phosphate buffered solution); HEPES (N-[2-Hydroxyethyl]piperazine-
N-[2-ethanesulfonic acid]); HBS (HEPES buffered saline); SDS (sodium dodecylsulfate);
Tris-HCl (tris[Hydroxymethyl]aminomethane-hydrochloride); Klenow (DNA polymerase I
large (Klenow) fragment); rpm (revolutions per minute); EGTA (ethylene glycol-
bis(β-aminoethyl ether) N,N,N',N'-tetraacetic acid); EDTA (ethylenediaminetetraacetic
acid); SPT+ (skin prick test positive); SPT- (skin prick test negative); ATCC (American Type
Culture Collection, Rockville, MD); Cedar Lane (Cedar Lane Laboratories, Ontario,
Canada); Gibco/Life Technologies (Gibco/Life Technologies, Grand Island, NY); Sigma
(Sigma Chemical Co., St. Louis, MO); Pharmacia (Pharmacia Biotech, Piscataway, NJ);
Procter & Gamble (Procter and Gamble, Cincinnati, OH); Genencor (Genencor
International, Palo Alto, CA); Endogen (Endogen, Woburn, MA); Cedarlane (Cedarlane,
Toronto, Canada); Dynal (Dynal, Norway); Novo (Novo Industries A/S, Copenhagen,
Denmark); Biosynthesis (Biosynthesis, Louisville, TX); TriLux Beta (TriLux Beta, Wallac,
Finland); DuPont/NEN (DuPont/NEN Research Products, Boston, MA); TomTec (Hamden,
CT); and Stratagene (Stratagene, La Jolla, CA).

Peptides 

All peptides were obtained from a commercial source (Mimotopes, San Diego, CA). 

For the I-MUNE® assay system described herein, 15-mer peptides offset by 3 amino acids
that described the entire sequence of the proteins of interest were synthesized in a multipin
format (See, Maeji et al., J. Immunol. Meth., 134:23-33 [1990]). Peptides were
resuspended in DMSO at approximately 1 to 2 mg/ml, and stored at -70°C prior to use.
Each peptide was tested at least in duplicate, although for small peptide sets (e.g., Ber e
1), the peptides were routinely tested in triplicate. The results for each peptide were
averaged and the stimulation index (SI) was calculated for each peptide.
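
The peptide layout described above (15-mers offset by 3 amino acids) can be illustrated with a short Python sketch. The function name `tile_peptides` is hypothetical, and the handling of the C-terminus (adding one final window so the last residues are covered) is an assumption; the source does not state how the end of the sequence was treated.

```python
def tile_peptides(sequence, length=15, offset=3):
    """Return (start_position, peptide) pairs covering the sequence.

    Start positions are 1-based, so the first peptide spans residues 1-15,
    the second 4-18, and so on.
    """
    if not sequence:
        return []
    if len(sequence) <= length:
        return [(1, sequence)]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, offset))
    # Assumption: add a final window if the regular tiling stops short
    # of the C-terminal residues.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s + 1, sequence[s:s + length]) for s in starts]


if __name__ == "__main__":
    # Arbitrary 21-residue toy sequence, not a sequence from the source.
    for start, peptide in tile_peptides("ACDEFGHIKLMNPQRSTVWYA"):
        print(start, start + len(peptide) - 1, peptide)
```

With this numbering, peptide start positions fall at 1, 4, 7, and so on, which matches the form of the residue ranges reported in the Examples (e.g., 34-48 and 160-174).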

Protein Sequences 

Amino acid sequences from the following well-characterized industrial enzymes
were tested and rank ordered using the methods of the present invention. The sequences
of these proteins are publicly available from databases such as Medline. The proteins that
are described herein in greatest detail include B. lentus subtilisin (Swissprot accession
number P29600), BPN' Y217L (Swissprot accession number P00782), ALCALASE®
enzyme (Swissprot accession number P00780), and alpha-amylase (Swissprot accession
number P06278).


Human Donor Blood Samples 

Volunteer donor human blood buffy coat samples were obtained from two 
commercial sources (Stanford Blood Center, Palo Alto, CA, and the Sacramento Medical 
Foundation, Sacramento, CA). Buffy coat samples were further purified by density
separation. Each sample was HLA typed for HLA-DR and HLA-DQ using a commercial
PCR-based kit (Bio-Synthesis). The HLA DR and DQ expression in the donor pool was
determined to not be significantly different from a North American reference standard (Mori
et al., Transplant., 64:1017-1027 [1997]). However, the donor pool did show evidence of
slight enrichments for ethnicities common to the San Francisco Bay Area. 


Preparation of Dendritic Cells and CD4+ T-Cells

Monocytes were purified by adherence to plastic in AIM V medium (Gibco/Life
Technologies). Adherent cells were cultured in AIM V medium containing 500 units/ml of
recombinant human IL-4 (Endogen) and 800 units/ml recombinant human GM-CSF
(Endogen) for 5 days. On day 5, recombinant human IL-1α (Endogen) and recombinant
human TNF-α (Endogen) were added to 50 units/ml and 0.2 units/ml, respectively. On day
7, the fully matured dendritic cells were treated with 50 µg/ml mitomycin C (Sigma) for 1
hour at 37°C. Treated dendritic cells were dislodged with 50 mM EDTA in PBS, washed in
AIM V medium, counted, and resuspended in AIM V medium at 2 x 10^5 cells/ml.

CD4+ T-cells were purified by negative selection from frozen aliquots of human
peripheral blood mononuclear cells (PBMC) using Cellect CD4 columns (Cedarlane). CD4+
T-cell populations were routinely >80% pure and >95% viable as judged by trypan blue
(Sigma) exclusion. CD4+ T-cells were resuspended in AIM V medium at 2 x 10^6 cells per ml.

I-MUNE® Assay Conditions

CD4+ T-cells and dendritic cells were plated in round-bottomed 96 well format plates
at 100 µl of each cell mix per well. Peptide was added to a final concentration of
approximately 5 µg/ml in 0.25-0.5% DMSO. Control wells contained 0.5% DMSO without
added peptide. Each peptide was tested in duplicate. Cultures were incubated at 37°C, in
5% CO2 for 5 days. On day 5, 0.5 µCi of tritiated thymidine (NEN DuPont) was added to
each well. On day 6, the cultures were harvested onto glass fiber mats using a TomTec
manual harvester (TomTec), then processed for scintillation counting. Proliferation was
assessed by determining the average counts per minute (CPM) value for each set of
duplicate wells (TriLux Beta). This method is also described in U.S. Patent No. 6,218,165
and Stickler et al., J. Immunother. 23:654-660 (2000), both of which are herein
incorporated by reference.

Data Analysis 

For each individual buffy coat sample, the average CPM values for all of the
peptides were analyzed. The average CPM value for each peptide was divided by the
average CPM value for the control (DMSO only) wells to determine the "stimulation index"
(SI). Donors were tested with each peptide set until an average of at least two responses
per peptide was compiled. The data for each protein were graphed showing the percent
responders to each peptide within the set. A positive response was collated if the SI value
was equal to or greater than 2.95. This value was chosen as it approximates a difference
of three standard deviations in a normal population distribution. For each protein assessed,
positive responses to individual peptides by individual donors were compiled. To determine
the background response for a given protein, the percent responders for each peptide in
the set were averaged and a standard deviation was calculated. SI values for each donor
were compiled for each peptide set, and the percent of responders reported. The average
background response rate for each peptide set was calculated by averaging the percent
response for all of the peptides in the set. Statistical significance was calculated using
Poisson statistics for the number of responders to each peptide within the dataset.
Different statistical methods were used as described herein. The response to a peptide
was considered significant if the number of donors responding to the peptide was different
from the Poisson distribution defined by the dataset with a p < 0.05.
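
The stimulation index and background calculations described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the experiments: the function names and the donor-by-peptide matrix layout are hypothetical, and only the SI definition, the 2.95 threshold, and the averaging of percent responders come from the text.

```python
from statistics import mean

SI_THRESHOLD = 2.95  # positive-response cutoff stated in the text


def stimulation_index(peptide_cpm, control_cpm):
    """Average CPM of the peptide wells divided by average CPM of the
    DMSO-only control wells."""
    return mean(peptide_cpm) / mean(control_cpm)


def percent_responders(si_values, threshold=SI_THRESHOLD):
    """Percent of donors whose SI for one peptide meets the threshold."""
    positives = sum(1 for si in si_values if si >= threshold)
    return 100.0 * positives / len(si_values)


def background_response(si_matrix):
    """si_matrix[d][p] holds the SI for donor d and peptide p (assumed layout).

    Returns the average percent response over all peptides in the set,
    together with the per-peptide percentages.
    """
    n_peptides = len(si_matrix[0])
    per_peptide = [
        percent_responders([row[p] for row in si_matrix])
        for p in range(n_peptides)
    ]
    return mean(per_peptide), per_peptide
```

For example, a toy matrix of four donors and three peptides in which one donor responds to the first peptide and two donors respond to the second yields per-peptide values of 25%, 50%, and 0%, and a background of 25%.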

Peptide Binding Analysis 

In addition to the above I-MUNE® assay, peptide binding assays were also
performed. The peptide binding assay used during the development of the present
invention is known in the art (Southwood et al., J. Immunol., 160:3363-3373 [1998]).
Briefly, HLA-DR and -DQ molecules were purified from a panel of EBV transformed cell
lines. A competition assay was performed with a characterized standard peptide, and the
unknown peptide. The amount of unknown peptide required to compete 50% of the
standard peptide binding was then determined (indicated as the IC50).

Statistical Methods

Statistical significance of peptide responses was calculated based on Poisson
statistics. The average frequency of responders was used to calculate a Poisson
distribution based on the total number of responses and the number of peptides in the set.
A response was considered significant if p < 0.05. In addition, two-tailed Student's t-tests
with unequal variance were performed. For epitope determination using data with low
background response rates, a conservative Poisson based formula was applied:

$$f(x) = n \left( \frac{\lambda^{x} e^{-\lambda}}{x!} \right)$$

where n = the number of peptides in the set, x = the frequency
of responses at the peptide of interest, and λ = the median frequency of responses within
the dataset. For epitope determinations based on data with a high background response
rate, the less stringent Poisson based determination

$$f(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$

was used, where λ = the median frequency of responses in the dataset, and x = the
frequency of responses at the peptide of interest.
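
A minimal Python sketch of the two Poisson based determinations follows. It assumes that the less stringent form is the Poisson probability of observing exactly x responses given a median frequency λ, and that the conservative form multiplies that probability by n, the number of peptides in the set; this reading is inferred from the variable definitions in the text, and the function names are hypothetical.

```python
from math import exp, factorial


def poisson_probability(x, lam):
    """Poisson probability of exactly x responses given median frequency lam
    (assumed form of the less stringent determination)."""
    return lam ** x * exp(-lam) / factorial(x)


def conservative_probability(x, lam, n):
    """Assumed conservative form: the Poisson probability scaled by n,
    the number of peptides in the set."""
    return n * poisson_probability(x, lam)


def is_significant(p_value, alpha=0.05):
    """A response is considered significant if p < 0.05."""
    return p_value < alpha
```

Scaling by n makes the conservative form harder to satisfy as the peptide set grows, which is consistent with its use for datasets with low background response rates.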

In additional embodiments, the structure determination was calculated based on the
following formula:

$$\sum_{i=1}^{p} \left| f(i) - \frac{1}{p} \right|$$

wherein Σ (upper case sigma) is the sum of the absolute value of the frequency of
responses to each peptide minus the frequency of that peptide in the set; f(i) is defined
as the frequency of responses for an individual peptide; and p is the number of peptides in
the peptide set.

This equation returns a value between 0 and 2, which is equal to the "Structure
Value." A value of 0 indicates that the results are completely without structure, and a value
of 2.0 indicates that all responses are concentrated in a single area. The closer the value
is to 2.0, the more immunogenic the protein. Thus, a low value indicates a less
immunogenic protein.
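
The structure value can be sketched in Python as shown below. The sketch assumes that f(i) is the fraction of all positive responses in the dataset that fall on peptide i, and that "the frequency of that peptide in the set" is 1/p; both readings are inferred from the stated 0 to 2 range, and the function name is hypothetical.

```python
def structure_value(responses):
    """Sum over peptides of |f(i) - 1/p|.

    responses[i] is the number of positive responses recorded for peptide i.
    f(i) is taken to be that peptide's share of all responses (assumption).
    """
    total = sum(responses)
    if total == 0:
        raise ValueError("no responses recorded; structure value is undefined")
    p = len(responses)
    return sum(abs(r / total - 1 / p) for r in responses)
```

Under these assumptions, an even spread of responses across all peptides gives 0, while responses confined to a single peptide give 2(1 - 1/p), which approaches 2 as the peptide set grows.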



HLA Types Within the Donor Pool 

HLA-DR and DQ types were analyzed for associations with responses to defined
epitope peptides. A Chi-squared analysis with one degree of freedom was used to
determine significance. Where an allele was present in both the responder and
non-responder pools, a relative risk was calculated.
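
The association test can be illustrated with a 2 x 2 table of allele carriage versus response. The following Python sketch is an assumption-laden illustration: the source does not give its relative risk formula, so the ratio of response rates in allele carriers versus non-carriers is used here, and the function names and table layout are hypothetical. The p-value for one degree of freedom is obtained from the complementary error function.

```python
from math import erfc, sqrt


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (one degree of freedom).

    a: allele present, responder      b: allele present, non-responder
    c: allele absent, responder       d: allele absent, non-responder
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-squared distribution with 1 d.f.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


def relative_risk(a, b, c, d):
    """Response rate in allele carriers divided by the rate in non-carriers
    (assumed definition; not specified in the source)."""
    return (a / (a + b)) / (c / (c + d))
```

No continuity correction is applied in this sketch; whether one was used in the original analysis is not stated.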

The HLA-DRB1 allelic expression was determined for approximately 185 random 
individuals. HLA typing was performed using low-stringency PCR determinations. PCR 
reactions were performed as directed by the manufacturer (Bio-Synthesis). The data 
compiled for the Stanford and Sacramento samples were compared to the "Caucasian" HLA-
DRB1 frequencies as published (See, Marsh et al., The HLA FactsBook, Academic Press,
San Diego, CA [2000], page 398, Figure 1). The donor population in these communities is
enriched for HLA-DR4 and HLA-DR15. However, the frequencies of these alleles in these
populations are well within the reported range for these two alleles (5.2 to 24.8% for HLA-
DR4 and 5.7 to 25.6% for HLA-DR15). Similarly, for HLA-DR3, -DR7 and -DR11, the
frequencies are lower than the average Caucasian frequency, but within the reported
ranges for those alleles. Also of note, HLA-DR15 is found at a higher frequency in ethnic
populations that are heavily represented in the San Francisco Bay Area. 



EXAMPLE 1 

Compiled Results for Four Known Respiratory Allergens 

This Example describes the results obtained when the I-MUNE® assay and analysis
methods of the present invention described above were used to test four known
respiratory allergens.

A. Alpha Amylase 

In these experiments, 82 individuals were tested with peptides derived from the 
alpha amylase sequence. The background response to peptides in this set was 2.80 +/- 
3.69%, well within the overall average obtained in tests with 11 industrial enzymes of 3.16
+/- 1.57 (data not shown). Prominent responses were noted to amino acids 34-48, 160-
174, and 442-456 of alpha amylase (See, Figure 2). All three of these responses were 
highly significant above the background response (p < 0.0001). 

B. B. lentus Subtilisin 

In these experiments, 65 individuals were tested with two replicate peptide sets for
this protein and the results were compiled. The background for this peptide set was found
to be 3.45 +/- 2.90%, within the established range. A prominent response was noted
at amino acids 160-174 (p = 0.0003) (See, Figure 3).

C. BPN' Y217L

In these experiments, 113 individuals were tested with two peptide sets. The 
compiled average for this dataset was 3.62%. Prominent responses were noted at amino 
acids 70-84 and 109-123 (See, Figure 4). A region of responses was also noted around 
amino acid 154. 

D. ALCALASE® Enzyme 

In these experiments, 92 individuals were tested with peptides derived from this 
enzyme. The background response to this protein was found to be low (2.35%). The same 
peptide set was tested in two temporally spaced analyses, and the data were compiled. In 
addition, there were significantly more peptides returning no response within the set for this 
protein. A prominent response was noted at amino acids 19-33 (p < 0.0001) (See,
Figure 5).

EXAMPLE 2 
Structure Calculations 

This Example describes the structure values obtained for the four enzymes tested. 
Structure values are dependent on the number of donors tested. A zero response rate 
across most of the dataset results in a structure value of -1 .0. The same number of 
responses at each peptide yields a structure value of 0. Therefore, it is important to test a 
peptide set until responses across the majority of the dataset are accumulated, in order for 
the data to accurately reflect responsivity to particular peptides and peptide regions. The 
structure value decreases with increasing numbers of donors tested until a plateau level is 
reached, usually between 2-3 responses per peptide (See, Figure 6). The plateau structure 
value must be used for comparing structure values. 

For each of the enzymes tested, the compiled responses were used to calculate 
structure within the dataset. The structure values were: 0.81 for amylase, 0.72 for 
ALCALASE® enzyme, 0.64 for B. lentus subtilisin, and 0.53 for BPN' Y217L, as shown in 
Table 1. 



Table 1. Structure Determination for Four Respiratory Allergens

Enzyme                 Peptides   n (donors)   Responses     Number of         Structure
                                               per peptide   epitope regions   value
Amylase                157        82           2.29          3                 0.81
B. lentus subtilisin   86         65           2.24          1                 0.64
ALCALASE®              88         92           2.16          1                 0.72
BPN' Y217L             88         113          3.65          2                 0.53

These results indicate that there is more activity induced by the amylase peptide
set, when CD4+ T cell activation is measured by a level of proliferation resulting in an SI of
2.95 or greater, as compared to activity measured using the other peptide sets. The result
for BPN' Y217L indicates that the peptide set derived from the sequence of this protein was
the least active, with the lowest amount of structure. The structure values rank order the
four tested proteins as:

amylase > ALCALASE® enzyme > B. lentus subtilisin > BPN' Y217L
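
The rank ordering follows directly from sorting the Table 1 structure values in descending order, as in this short Python sketch (the dictionary layout and variable names are illustrative only):

```python
# Structure values as reported in Table 1.
structure_values = {
    "amylase": 0.81,
    "B. lentus subtilisin": 0.64,
    "ALCALASE enzyme": 0.72,
    "BPN' Y217L": 0.53,
}

# Higher structure value = higher predicted relative immunogenicity.
ranking = sorted(structure_values, key=structure_values.get, reverse=True)

if __name__ == "__main__":
    print(" > ".join(ranking))
```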

EXAMPLE 3

Comparison to Animal Models

As indicated above, two animal models have been used for the prediction of
allergenicity and immunogenicity of industrial proteins. Thus, in this Example, comparisons
made between these two animal models and the methods of the present invention are
described. Both the guinea pig (GPIT) and BDF1 mouse (MINT) models rank the proteins
in the order: amylase > ALCALASE® enzyme > B. lentus subtilisin > BPN' Y217L. However,
the relative values differ. Figure 7 shows the structure values graphed versus the GPIT
(Panel A) and MINT (Panel B) potency values. Human cell-based structure data obtained
using the methods of the present invention correlate with both models
(R² values of 0.86 and 0.84, respectively).
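
The correlation described above can be sketched as the squared Pearson correlation between two paired lists of values. This is only an illustration: the GPIT and MINT potency values are not reproduced in this text, so the function takes them as inputs, and the function name is hypothetical. It also assumes that the reported R² values come from a simple linear fit.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)
```

In use, `xs` would hold the four structure values and `ys` the corresponding potency values from one animal model.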

EXAMPLE 4 
Structure Values of Additional Proteins 

In this Example, structure values obtained for additional proteins are described.
Structure values were calculated for Ber e 1 (i.e., the major allergen found in
Brazil nuts), human interferon-beta (IFN-β), human thrombopoietin (Tpo), a mouse VH 36-60
family member and human β2-microglobulin (See, Table 2).



Table 2. Structure Values for Selected Additional Proteins

Protein                  Peptides   n    Average      Response      Number of         Structure
                                         background   per peptide   epitope regions   value
hTpo                     52         99   2.56         2.54          1                 0.65
hIFN-β                   52         88   3.17         2.79          1                 0.75
Ber e 1                  27         92   4.27         3.92          2                 0.66
Mouse VH 36-60 family    35         74   7.0          5.23          0                 0.38
β2-microglobulin         36         87   3.9          3.39          0                 0.39

Human IFN-β, Tpo and Ber e 1 are all known to induce immune responses in
humans (See, Scagnolari et al., J. Interferon Cytokine Res., 22:207-213 [2002];
Sicherer and Sampson, Curr. Opin. Pediatr., 12:567-573 [2000]; and Li et al., Blood
98:3241-3248 [2001]). The structure values for IFN-β, Tpo and Ber e 1 are all
comparatively high. The value for the mouse VH region is comparatively low, suggesting
that this protein is comparatively non-immunogenic. This result is consistent with a
structural analysis of potential immunogenicity of the mouse heavy chain families (See,
Olsson et al., [1991] supra). In addition, the result for β2-microglobulin is low, consistent
with tolerance induction to this ubiquitously expressed protein (Guery et al., [1995] supra).



EXAMPLE 5 
Population-Based Immune Responses 

In this Example, experiments conducted to assess the immune responses of a
population are described. The donor bloods were obtained from Stanford
and Sacramento, as indicated above, as this population has a distribution that is not
statistically different from the general "Caucasian" population in the U.S. Samples from
these donor bloods were tested in the I-MUNE® assay system described above. The
structure values were calculated and collated for every protein tested in the I-MUNE®
assay for which there were more than two responses per peptide. The proteins tested
were Ber e 1 (Brazil nut allergen); scFv (single-chain V region of an antibody; the VH and
VL segments); BLA (β-lactamase); IFN-β (interferon-beta); FNA (subtilisin BPN' Y217L);
α-amylase; E6 (human papillomavirus E6 protein in HPV strains 16, 18, 31, and 33); E7
(HPV E7 protein in HPV strains 16, 18, 31, 33, 45 and 52); eglin (leech protease inhibitor;
GenBank Accession No. CAA25380); RECK (human protease inhibitor; actually, a small
domain within the 971 amino acid RECK protein [GenBank Accession No. NP_066934]
was tested); staphylokinase; TPO (human thrombopoietin); ecotin (serine protease inhibitor
from E. coli K12; GenBank Accession No. NP_416713); ALCALASE® enzyme; savinase;
human β-2 microglobulin; and sTNFRI (soluble tumor necrosis factor receptor 1). The results
of these experiments are shown in Table 3. In this Table, the data indicate how many
donors responded (i.e., mounted a proliferative response with an SI > 2.95) to each peptide
in the pepset.
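
The tallying described above can be sketched as follows, assuming the raw data are available as a donors-by-peptides matrix of stimulation index (SI) values. The function names and the matrix layout are illustrative assumptions. The background percent is computed here as the mean number of responses per peptide divided by the number of donors, which appears consistent with the tabulated values but is not stated as a formula in this section. A cutoff of "2.95 or greater" is used, following the wording elsewhere in the text.

```python
SI_CUTOFF = 2.95  # positive-response threshold quoted in the text


def responses_per_peptide(si_matrix, cutoff=SI_CUTOFF):
    """Count, for each peptide (column), the donors (rows) whose SI meets the cutoff."""
    n_peptides = len(si_matrix[0])
    return [
        sum(1 for donor_row in si_matrix if donor_row[p] >= cutoff)
        for p in range(n_peptides)
    ]


def background_percent(counts, n_donors):
    """Mean responses per peptide expressed as a percentage of donors tested."""
    mean_responses = sum(counts) / len(counts)
    return 100.0 * mean_responses / n_donors
```

For staphylokinase, 4.48 responses per peptide over 72 donors gives 100 × 4.48 / 72 ≈ 6.22%, which matches the background value reported for that protein.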
Table 3. Results

Test Protein          Structure Value   Response/Peptide   Background %
Ber e 1               0.66              3.93               4.26
scFv                  0.39              3.96               4.9
BLA                   0.56              2.62               3.27
IFN-β                 0.75              2.79               3.17
FNA                   0.65              3.61               3.65
Amylase               0.81              2.29               2.79
E6 (HPV16)            0.72              3.92               7.12
E6 (HPV18)            0.79              2.32               4.23
E6 (HPV31)            0.53              3.26               4.66
E6 (HPV33)            0.68              1.97               [illegible]
E7 (HPV16)            0.66              3.9                [illegible]
E7 (HPV18)            0.44              [illegible]        [illegible]
E7 (HPV31)            0.78              3.1                [illegible]
E7 (HPV33)            0.54              [illegible]        [illegible]
E7 (HPV45)            0.76              2.44               [illegible]
E7 (HPV52)            0.59              [illegible]        [illegible]
Eglin                 0.43              4.9                5.57
RECK                  0.39              4.1                4.64
Staphylokinase        0.44              4.48               6.22
Tpo                   0.65              2.24               2.53
Ecotin                0.64              3.98               5.69
Alcalase              0.72              2.16               2.35
GG36                  0.65              2.24               3.45
β-2 microglobulin     0.39              3.38               3.9
sTNFRI                0.47              2.9                4.2



Four regions of exposure and immune response level were determined. Figure 10
provides a graph showing the relative structure value and background percent for these
proteins. Quadrant "1" reflects the number of individuals in the population who have not
been exposed to the test protein (represented by 1 data point [♦]), while quadrant "2"
reflects the number of individuals in the population who have been exposed to the test
protein, wherein the exposure is more recent, frequent, and/or qualitatively more
immunogenic; quadrant "3" reflects the number of individuals in the population who have
been tolerized to the test protein and/or the test protein is non-immunogenic; and quadrant
"4" reflects the number of individuals in the population who have been exposed to the test
protein, but the exposure was long-past or infrequent, and/or the test protein is qualitatively
less immunogenic.

The proteins in quadrant "1" were BLA, IFN-β, FNA, amylase, HPV33 E6, HPV45
E7, TPO, ALCALASE® enzyme, and GG36; the proteins in quadrant "2" were Ber e
1, HPV16 E6, HPV18 E6, HPV31 E6, HPV16 E7, HPV31 E7, HPV33 E7, HPV45 E7,
HPV52 E7, and ecotin; the proteins in quadrant "3" were HPV18 E7 and β-2 microglobulin;
and the proteins in quadrant "4" were scFv, eglin, RECK, staphylokinase, and sTNFRI.
Thus, it is clear that the present invention provides means to assess the population-based
immune responses to any protein.
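
The quadrant assignment can be sketched as a two-threshold classification on structure value and background percent. The cutoffs are not stated in this section; the values 0.5 and 4.0% used below are assumptions, chosen only because they separate the legible entries of Table 3 into the quadrants listed above. The function name is likewise hypothetical.

```python
# Assumed cutoffs (not given in the text); see the note above.
STRUCTURE_CUTOFF = 0.5
BACKGROUND_CUTOFF = 4.0  # percent


def quadrant(structure_value, background_percent):
    """Return the quadrant number (1-4) for a protein."""
    high_structure = structure_value >= STRUCTURE_CUTOFF
    high_background = background_percent >= BACKGROUND_CUTOFF
    if high_structure and not high_background:
        return 1  # unexposed population
    if high_structure and high_background:
        return 2  # recent, frequent, or more immunogenic exposure
    if not high_structure and not high_background:
        return 3  # tolerized and/or non-immunogenic
    return 4      # long-past or infrequent exposure, or less immunogenic
```

For example, `quadrant(0.81, 2.79)` places amylase in quadrant 1 and `quadrant(0.44, 6.22)` places staphylokinase in quadrant 4, in agreement with the lists above.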


EXAMPLE 6

Creation of Variants with Reduced Structure Values

In this Example, methods for the creation of variants with reduced structure values
are provided. As an example of how the structure analysis finds use in calculating the
overall immunogenicity of variant proteins designed to reduce immunogenicity in humans, a
structure value was calculated for a variant where the prominent responses to amino acids
70-84 and 109-123 in BPN' Y217L were reduced to background level responses. A limited
dataset of 48 individuals was tested using peptide variants to the 70-84 and 109-123
regions of BPN' Y217L. Responses to the variants were found to be at background level.
The complete dataset of 113 individuals was modified for structure calculations by reducing
the responses to 70-84 and 109-123 to background levels. The structure was calculated
this way in order to predict what the structure value would have been if 113 individuals had
been tested along with the parent molecule. Since responses were removed from the
calculation, an equivalent number of responses were scattered randomly through the
dataset in order to maintain the same overall rate of response. The structure value for the
modified protein variant was calculated to be 0.40 (See, Table 4).
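
The dataset modification described above can be sketched on per-peptide response counts. This is a simplified illustration under stated assumptions: the function name is hypothetical, the data are treated as one count per peptide rather than per-donor responses, and the removed responses are scattered only over peptides outside the modified epitope regions (the text says only that they were scattered randomly through the dataset).

```python
import random


def flatten_epitopes(counts, epitope_positions, background, seed=0):
    """Reduce epitope peptides to a background count and scatter the
    removed responses at random so the total number is unchanged."""
    rng = random.Random(seed)
    modified = list(counts)
    epitopes = set(epitope_positions)
    removed = 0
    for pos in epitopes:
        if modified[pos] > background:
            removed += modified[pos] - background
            modified[pos] = background
    # Assumption: redistribute only over peptides that were not modified.
    others = [i for i in range(len(modified)) if i not in epitopes]
    for _ in range(removed):
        modified[rng.choice(others)] += 1
    return modified
```

The returned counts keep the same overall response rate as the input, which is the property the paragraph above relies on when comparing the parent and variant structure values.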

Table 4. Structure Calculations for a Potential Protease Variant

Protease        Prominent Epitopes   Structure Value
BPN' Y217L      2                    0.53
BPN' variant    0                    0.40



In addition, in vitro data indicated that the protease variant with the lower structure
value induced less proliferation. In these experiments, PBMC from thirty community
donors were tested parametrically with either the whole protein parent enzyme (BPN'
Y217L) or the variant protease. The enzymes were inactivated, and tested over a dose
range from 5 to 40 µg/ml. The highest SI values reached for each protein are shown in
Figure 9. The parent protease had a structure value of 0.53, and the variant had a
structure value of 0.40. The difference between optimal SI values for the two proteins
tested on these thirty donors was significant, with a two-tailed parametric t-test value of
p < 0.01. These results indicate that reducing the structure value from 0.53 to 0.40 has a
profound effect on the in vitro antigenicity of the molecule.

In preferred methods of the present invention, when variant proteins are compared
to a parent protein either in vitro or in vivo, the proteins are preferably compared at the
same dose, in the same formulation, in a matched set of donors and over the same dose curve.
The variant proteins should retain the parent protein's general physical and structural
properties, such as stability and activity. Additionally, the structure analysis precludes any
processing differences between the parent protein and its variants.


EXAMPLE 7 
Designation of CD4+ T-cell Epitopes 

In this Example, data from unexposed and exposed donors are presented. These 
data are provided in addition to those in the above Examples. 


Unexposed Donors 

Sixty-five donors were tested with a set of 15-mer peptides synthesized to cover the
sequence of B. lentus subtilisin. The percent response to each peptide for the 65 donors is
shown in Figure 12, Panel A. A prominent response at position #54, corresponding to
amino acids 160-174, is apparent. Another region of prominence is also apparent at
peptide positions 23 and 31 (amino acids 67-81 and 91-105). The frequency of responses
to the peptides in the set is shown in Figure 12, Panel B. It is clear that the frequency of
responses to the peptide at amino acids 160-174 is different than the frequency of
responses to other peptides in the set. However, the significance of the responses at
amino acids 67-81 and 91-105 must be determined. Significance was determined by
establishing Poisson distributions for the frequency data, then determining the probability
that a dataset containing the number of values represented by the number of peptides in
the set would include as its highest member the value in question. For the peptide
represented by amino acids 160-174, this probability was p = 0.0004. For the other two
peptides, the probability was p = 0.50.
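
One possible reading of this significance test is sketched below: the probability that the largest of N independent Poisson-distributed response counts reaches at least the observed count. The function names are hypothetical, and two assumptions are made that the text does not spell out: the Poisson mean is taken to be the mean number of responses per peptide, and the peptides are treated as independent.

```python
import math


def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson variable with the given mean."""
    if k < 0:
        return 0.0
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))


def prob_highest_at_least(k, mean, n_peptides):
    """P(the highest of n_peptides independent Poisson counts is >= k)."""
    return 1.0 - poisson_cdf(k - 1, mean) ** n_peptides
```

For B. lentus subtilisin, a call such as `prob_highest_at_least(observed, 2.24, 86)` would use the responses per peptide and peptide count from Table 1, with `observed` read from the frequency data in Figure 12.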

As a test of the epitope selection criteria, a set of seven donors verified to have
been exposed to B. lentus subtilisin by skin-prick testing were also tested using the
I-MUNE® assay system described herein. The number of responses at each peptide is
shown for all seven donors (See, Figure 13). Only one peptide was found to elicit more
than two responses. The three responders to the amino acids 163-177 peptide included
both of the HLA-DR2(15) positive donors. An association with response to this peptide and
HLA-DR2(15) was noted previously (Stickler et al., J. Immunother., 23:654-660 [2000]).
There were two donors that responded to six peptide regions, including the 67-81 region.
No other peptide from the exposed donor data was prominent in the unexposed donor data.
The 67-81 region has high homology (14/15 amino acid identity) to a known CD4+ T cell
epitope in a related protease, and half of these donors were also SPT+ to this second
protease. Therefore, as a conservative estimate, one verified epitope was found in the
unexposed donor population, and this epitope is found to be prominent in a set of epitopes
recognized by verified protein-exposed donors.

Similar results were observed for another related subtilisin from B.
amyloliquefaciens. Two prominent epitope regions that were highly significant were
described, and these two epitopes were also found in a set of verified SPT+ donors (data
not shown). As above, more prominent epitope regions were seen in compiled data from
exposed donors, and the epitope peptides defined in the unexposed donor set were a
subset of these.


Memory Responses 

The I-MUNE® assay described above was performed on a set of peptides derived
from the sequence of staphylokinase. Staphylokinase was selected for these experiments
because the general population accumulates specific responses to this protein
over time (See, Warmerdam et al., J. Immunol., 168:155-161 [2002]). A set of 72
community donors was tested in the I-MUNE® assay system of the present invention with
this protein. The responses to peptides in the staphylokinase set are shown in Figure 14,
Panel A. There are no clearly prominent responses in the staphylokinase data set. This is
clearly shown in the frequency data (See, Figure 14, Panel B) where, unlike the frequency
data for B. lentus subtilisin, there are no individual peptides that accumulated responses at
a rate that was clearly distinct from the distribution of responses to the other peptides.
However, the prominent response rates at positions 5 (amino acids 13-27), 20 and 21
(amino acids 58-75), 29 (amino acids 85-99) and 36 (amino acids 106-120) are of interest.
The dataset shows an average response of 4.48 responses per peptide (background =
6.22%; See, Table 5, below). If this value is used to define the median of a Poisson
distribution, a less conservative analysis indicates that the response frequencies displayed
by all of the prominent peptides outlined above are significant (p < 0.05). This analysis is
much less conservative than the analysis used to assign significance to epitopes found in
the unexposed donors, as the Poisson distribution is defined by the median background
value, and difference from this value is used to determine significance.
Table 5. Background Values for Proteins with Presumed Donor Pre-exposure

Protein                  Donors     Expected            Responses/peptide   Background        t-test (e)
                         tested     responses/          found (c)           +/- s.d. (d)
                                    peptide (b)
11 industrial enzymes    n.a. (a)   n.a.                n.a.                3.15 +/- 1.57     n.a.
HPV16 E6                 55         1.65                3.92                7.12 +/- 6.48     P = 0.0003
HPV18 E6                 55         1.65                2.32                4.23 +/- 4.25     P = 0.16
Ber e 1                  92         2.77                3.92                4.26 +/- 4.05     P = 0.22
staphylokinase           72         2.17                4.48                6.22 +/- 3.47     P = 0.0001
IFN-beta                 88         2.65                2.79                3.17 +/- 3.28     n.d. (f)
Tpo                      99         2.99                2.51                2.54 +/- 2.23     n.d.
TNF-R1                   69         2.08                1.54                2.23 +/- 1.95     n.d.



In this Table, "a" indicates "not applicable"; "b" indicates the expected number of
responses per peptide for the number of donors tested, based on the data from the 11
industrial proteins shown in Figure 11; "c" indicates the response per peptide value
determined experimentally for the protein tested; "d" indicates the background response
value for the protein tested; "e" indicates the two-tailed, unequal variance t-test comparing
the background values for the 11 industrial enzymes to the background response of the
protein tested; and "f" indicates "not determined."

The five epitope peptides identified in the I-MUNE® assay were compared to
published epitopes defined using cloned CD4+ T cell lines from donors with antigen-
specific responses to staphylokinase (See, Figure 15).

The regions defined using cloned T cells from 10 donors, D1, F2, C3, and D4,
contain core sequences (common peptide sequence between the majority of the
responding clones) that correspond to I-MUNE® assay-identified peptides 5, 20, 21 and 36,
respectively. The I-MUNE® assay identified an epitope peptide at position 29 (amino acids
85-99) that was not detected using CD4+ T cell clones. This peptide was associated with the
presence of HLA-DR5(11). Only one donor who provided clones for the CD4+ T cell clone
study carried this allele, and therefore it may have been missed. Alternatively, this peptide
may not be processed from staphylokinase, and the result would therefore be a false
positive within the I-MUNE® assay dataset. However, the carboxy terminus of the protein,
region A5, was previously reported as being recognized by T cell clones (See, Warmerdam
et al., supra). The I-MUNE® assay located an epitope in a subset of the region, peptide 36,
which corresponded with the adjacent D4 region. Overall, the alignment between the
epitopes found using the less conservative epitope designation described and the
published epitopes was excellent. In addition, the HLA associations reported are consistent
between the two datasets (See, Figure 15).
Negative Control

As a negative control, human β2-microglobulin was also tested in the I-MUNE®
assay with samples from 87 community donors. This protein was selected as a negative
control as it is present as part of the HLA class I molecule on the surface of all somatic
cells. In addition, β2-microglobulin is expressed in the thymus during T cell development.
Both central and peripheral tolerance mechanisms should affect the T cell repertoire,
removing any CD4+ T cell with significant cross-reactivity to β2-microglobulin-derived
peptides (See, Guery et al., J. Immunol., 154:545-554 [1995]). Finally, there is minimal
allelic variation in this molecule. One allelic variant was found in a database search (not
shown). The results are shown in Figure 16. The average background response to
β2-microglobulin was 3.90 +/- 1.82 percent. The percent responses to the peptides are shown
in Figure 16, Panel A, and the frequency of responses is shown in Figure 16, Panel B.
None of the peptide responses were significant based on the statistical method for an
unexposed donor population with a low background response rate.

Reproducibility of Response Rates 

The reproducibility of epitope peptide responses was determined by repeat testing
of epitope peptides. Peptides were synthesized at least twice and were tested on multiple
discrete groups of donors. The number of donors tested in each test ranged from 27 to 103.
The average percent responses to the peptides were compared. The results are
shown in Table 6. The average coefficient of variation (CV) for the four epitope peptides
was 20%, and the median value was 21%. The range of CVs was 9.3 to 27%. These
values compare favorably to other human cell-based ex vivo assays (Keilholz et al., J.
Immunother., 25:97-138 [2000]; and Asai et al., Clin. Diagn. Lab. Immunol., 7:145-154
[2000]). In Table 6, "s.d." is standard deviation, "s.e." is standard error, and
"(s.d./average*100)" is the percent CV. The average and the median values for the four
peptides are shown.
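
The summary statistics can be checked with a short sketch. The percent CV follows the definition given above (s.d./average*100); the standard error is assumed here to be s.d. divided by the square root of the number of tests, which agrees with the tabulated values to rounding. Function and variable names are illustrative.

```python
import math
import statistics


def percent_cv(average, sd):
    """Percent coefficient of variation: s.d. / average * 100."""
    return 100.0 * sd / average


def standard_error(sd, n_tests):
    """Standard error of the mean, assuming s.e. = s.d. / sqrt(n)."""
    return sd / math.sqrt(n_tests)


# %CV values as reported in Table 6 for the four epitope peptides.
table6_cv = [9.32, 19.99, 23.18, 27.19]
mean_cv = statistics.mean(table6_cv)
median_cv = statistics.median(table6_cv)
```

For the IFN-β peptide (average 16.41, s.d. 1.53, three tests), `percent_cv` and `standard_error` reproduce the tabulated 9.32 and 0.88 after rounding to two decimals.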



Table 6. Reproducibility of Epitope Peptide Responses

Peptide            Number of tests   Average   s.d.   s.e.   %CV
IFN-β              3                 16.41     1.53   0.88   9.32
TPO                3                 9.18      1.83   1.06   19.99
BPN' Y217L #24     4                 11.69     2.71   1.35   23.18
BPN' Y217L #37     4                 12.91     3.51   1.76   27.19

Average for all                                              19.92
Median                                                       21.59

Epitopes Confirmed with Binding Studies

The IC50 for HLA class II protein binding was determined for peptide epitopes
defined by the I-MUNE® assay in two related industrial bacterial proteases (See, Figure 17).
The peptides were tested in a competition assay for binding to 18 different HLA-DR and -DQ
proteins. The prominent epitope in B. lentus subtilisin was found to bind a range of HLA-DR
and -DQ molecules in two different frames (160-174 and 157-171), indicating promiscuous
binding. Peptide binding to HLA-DR2(15) was found to be excellent, with an IC50 of 127
nM. Only HLA-DR1 displayed a lower IC50 value. Of the two epitopes defined by the
I-MUNE® assay in B. amyloliquefaciens subtilisin BPN' Y217L, the second epitope (amino
acids 109-123) was found to be promiscuous in both the HLA analysis and in the binding
analysis described in this Example. The first epitope (amino acids 70-84) also binds most
HLA class II molecules tested, but it binds HLA-DR6(13) with an IC50 of 0.69 nM. This
likely explains the association seen in the data for a response to this peptide with
HLA-DR6(13) donors (p = 0.00015; relative risk = 7.22, n = 113 donors tested). Those results
with values less than 500 nM were considered to be good binders and are highlighted in
bold in Figure 17. Also, in this Figure, degeneracy indicates the number of HLA Class II
proteins that bind with an IC50 of less than 500 nM out of the 18 total alleles tested.
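
The degeneracy measure defined above can be sketched as a simple count of alleles with an IC50 below the 500 nM cutoff. The function name and the dictionary layout are illustrative assumptions; in the example, only the HLA-DR2(15) value of 127 nM comes from the text, and the second entry is a hypothetical placeholder added to show a non-binder.

```python
GOOD_BINDER_NM = 500.0  # cutoff for a "good binder" stated in the text


def degeneracy(ic50_by_allele, cutoff_nm=GOOD_BINDER_NM):
    """Number of HLA class II proteins bound with an IC50 below the cutoff."""
    return sum(1 for ic50 in ic50_by_allele.values() if ic50 < cutoff_nm)


example = {
    "HLA-DR2(15)": 127.0,            # value reported in the text
    "hypothetical allele": 2500.0,   # placeholder, not a measured value
}
print(degeneracy(example))
```

With the full set of 18 alleles from Figure 17 as input, the same count would give the degeneracy reported for each peptide.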



CLAIMS

1. A method for ranking the relative immunogenicity of a first protein and at least
one additional protein, comprising the steps of:

(a) preparing a first pepset from a first protein and preparing at least one
additional pepset from each of said additional proteins;

(b) obtaining from a single human blood source a solution comprising
dendritic cells and a solution of naive CD4+ and/or CD8+ T-cells;

(c) differentiating said dendritic cells to produce a solution of 
differentiated dendritic cells; 

(d) combining said solution of differentiated dendritic cells and said naive 
CD4+ and/or CD8+ T-cells with said first pepset; 

(e) combining said solution of differentiated dendritic cells and said naive 
CD4+ and/or CD8+ T-cells with each of said pepsets from said additional proteins; 

(f) measuring proliferation of said T-cells in said steps (d) and (e), to 
determine the responses to each peptide in said first and additional pepsets; 

(g) compiling the responses of said T-cells in step (f) for said first protein 
and said additional proteins; 

(h) determining the structure value of said compiled responses of step (g) 
for said first protein and said additional proteins; and 

(i) comparing the structure value obtained for said first protein with the 
structure value for said additional proteins to determine the immunogenicity ranking 
of said first protein and said additional proteins. 

2. The method of Claim 1, wherein the protein having the lowest structure value
is ranked as less immunogenic than the protein having the higher structure value.

3. The method of Claim 1, wherein said at least two proteins are selected from
the group consisting of enzymes, hormones, cytokines, antibodies, structural proteins, and
binding proteins.

4. The method of Claim 1, wherein a positive response against said first protein
comprises a stimulation index value between about 2.7 and about 3.2.

5. The method of Claim 1, wherein a positive response against said additional
proteins comprises a stimulation index value between about 2.7 and about 3.2.

6. The method of Claim 1, wherein said proliferation of said T-cells in steps (d)
results in a stimulation index of about 2.95 or greater.

7. The method of Claim 1, wherein said proliferation of said T-cells in steps (e)
results in a stimulation index of about 2.95 or greater.

8. The method of Claim 1, wherein at least one additional human blood source
is used in step (b).

9. The method of Claim 8, wherein the structure values obtained for each of 
said human blood sources and said proteins are compared. 

10. The method of Claim 9, wherein the structure values and background percent 
response values of said proteins are used to rank said proteins. 

11. A method for ranking the relative immunogenicity of two proteins, wherein 
said second protein is a protein variant of said first protein, comprising the steps of: 

(a) preparing a first pepset from a first protein and a second pepset from 
a second protein; 

(b) obtaining from a single human blood source a solution of dendritic 
cells and a solution of naive CD4+ and/or CD8+ T-cells; 

(c) differentiating said dendritic cells, in said solution of dendritic cells, to 
produce a solution of differentiated dendritic cells; 

(d) combining said solution of differentiated dendritic cells and said naive
CD4+ and/or CD8+ T-cells with said first pepset;

(e) combining said solution of differentiated dendritic cells and said naive
CD4+ and/or CD8+ T-cells with said second pepset;

(f) measuring proliferation of said T-cells in said steps (d) and (e), to 
determine the responses to each peptide in said first and second pepsets; 

(g) compiling the responses of said T-cells in step (f) for said first protein 
and said second protein; 

(h) determining the structure value of said compiled responses of step (g) 
for said first protein and said second protein; and 

(i) comparing the structure value obtained for said first protein with the 
structure value for said second protein to determine the immunogenicity ranking of 
said first protein and said second protein. 
12. The method of Claim 11, wherein said second protein is ranked as less
immunogenic than said first protein.

13. The method of Claim 11, wherein said first protein is ranked as less
immunogenic than said second protein.

14. The method of Claim 11, wherein said first and second proteins are
selected from the group consisting of enzymes, hormones, cytokines, antibodies, structural
proteins, and binding proteins.

15. The method of Claim 11, wherein a positive response against said first
protein comprises a stimulation index value between about 2.7 and about 3.2.

16. The method of Claim 11, wherein a positive response against said second
protein comprises a stimulation index value between about 2.7 and about 3.2.

17. The method of Claim 11, wherein said proliferation of said T-cells in steps (d)
results in a stimulation index of about 2.95 or greater.

18. The method of Claim 11, wherein said proliferation of said T-cells in steps (e)
results in a stimulation index of about 2.95 or greater.

19. The method of Claim 11, wherein said second protein comprises a reduction
of at least one prominent region in said first protein.

20. The method of Claim 11, wherein the proliferation of said T-cells in step (e) is
at a background level.

21. The method of Claim 11, wherein at least one additional human blood source
is used in step (b).

22. The method of Claim 11, wherein the structure values obtained for each of
said human blood sources and said proteins are compared.

23. The method of Claim 22, wherein the structure values and background 
percent response values of said proteins are used to rank said proteins. 
24. A method for ranking the relative immunogenicity of a first protein and at least
one variant protein, comprising the steps of:

(a) preparing a first pepset from a first protein and pepsets from each of 
said variant proteins; 

(b) obtaining from a single human blood source a solution comprising 
dendritic cells and a solution of naive CD4+ and/or CD8+ T-cells; 

(c) differentiating said dendritic cells to produce a solution of 
differentiated dendritic cells; 

(d) combining said solution of differentiated dendritic cells and said naive 
CD4+ and/or CD8+ T-cells with said first pepset; 

(e) combining said solution of differentiated dendritic cells and said naive
CD4+ and/or CD8+ T-cells with each pepset prepared from said variant proteins;

(f) measuring proliferation of said T-cells in said steps (d) and (e), to 
determine the responses to each peptide in said first pepsets and each pepset from 
said variant proteins; 

(g) compiling the responses of said T-cells in step (f) for said first protein 
and said variant proteins; 

(h) determining the structure value of said compiled responses of step (g) 
for said first protein and said variant proteins; and 

(i) comparing the structure value obtained for said first protein with the 
structure value for said variant proteins to determine the immunogenicity ranking of 
said first protein and said variant proteins. 

25. The method of Claim 24, wherein at least one of said variant proteins is 
ranked as less immunogenic than said first protein. 

26. The method of Claim 24, wherein said first protein is ranked as less 
immunogenic than at least one of said variant proteins. 

27. The method of Claim 24, wherein said first and said variant proteins are
selected from the group consisting of enzymes, hormones, cytokines, antibodies, structural
proteins, and binding proteins.

28. The method of Claim 24, wherein a positive response against said first 
protein comprises a stimulation index value between about 2.7 and about 3.2. 
29. The method of Claim 24, wherein a positive response against at least one of
said variant proteins comprises a stimulation index value between about 2.7 and about 3.2.

30. The method of Claim 24, wherein said proliferation of said T-cells in steps (d)
results in a stimulation index of about 2.95 or greater.

31. The method of Claim 24, wherein said proliferation of said T-cells in steps (e)
results in a stimulation index of about 2.95 or greater.

32. The method of Claim 24, wherein at least one of said variant proteins 
comprises a reduction of at least one prominent region in said first protein. 

33. The method of Claim 32, wherein the proliferation of said T-cells in step (e) 
for at least one variant protein is at a background level. 

34. The method of Claim 22, wherein at least one additional human blood source 
is used in step (b). 

35. The method of Claim 22, wherein the structure values obtained for each of 
said human blood sources and said proteins are compared. 

36. The method of Claim 35, wherein the structure values and background 
percent response values of said proteins are used to rank said proteins. 

37. A method for determining the immune response of a test population against a
test protein, comprising the steps of:

(a) preparing a pepset from said test protein; 

(b) obtaining a plurality of solutions comprising human dendritic cells and 
a plurality of solutions of naive human CD4+ and/or CD8+ T-cells, wherein said 
solutions of human dendritic cells and solutions of naive human CD4+ and/or CD8+ 
T-cells are obtained from a plurality of individuals within said test population; 

(c) differentiating said dendritic cells to produce a plurality of solutions 
comprising differentiated dendritic cells; 

(d) combining said plurality of said solutions of differentiated dendritic 
cells and said solutions of naive CD4+ and/or CD8+ T-cells with said pepset, 
wherein each of said solutions of differentiated dendritic cells and said solutions of
naive CD4+ and/or CD8+ T-cells are from one individual within said test population
are combined;

(e) measuring proliferation of said T-cells in step (d), to determine the 
responses to each peptide in said pepset; 

(g) compiling the responses of said T-cells in step (e) for said test protein; 

(h) determining the structure value of said compiled responses of step (g) 
for said test protein; and 

(i) determining the level of exposure of said plurality of individuals to said 
test protein. 

38. The method of Claim 37, wherein said test protein comprises at least two test 
proteins. 

39. The method of Claim 37, wherein the level of exposure of said plurality of 
individuals to said test protein is compared. 

40. The method of Claim 37, wherein the structure values and background 
percent response values of said proteins are used to rank said proteins. 

41. The method of Claim 37, wherein said test protein is modified to produce a
variant protein that exhibits a reduced immunogenic response in said test population.